Apache Airflow, Part 1: Installation (continued)

 

■Initial setup

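Running the commands from the previous part makes Airflow create the $AIRFLOW_HOME folder and place an "airflow.cfg" file with default settings in it. You can inspect the file at $AIRFLOW_HOME/airflow.cfg or through the UI under the [Admin] -> [Configuration] menu. The web server's PID file is stored in $AIRFLOW_HOME/airflow-webserver.pid, or in /run/airflow/webserver.pid if it was started by systemd. (See Quick Start.)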

→ This time I ran airflow as root, so the $AIRFLOW_HOME directory ended up being created under ~root.
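As a quick way to inspect the generated file outside the UI, here is a minimal sketch that locates airflow.cfg and reads one option with Python's standard configparser. The helper names `airflow_cfg_path` and `read_core_option` are made up for this post. It assumes the default home of ~/airflow when $AIRFLOW_HOME is unset, and it disables interpolation so that the `%%` escapes in the log format options are returned as written.

```python
import configparser
import os
from pathlib import Path


def airflow_cfg_path(env=None):
    """Return the expected airflow.cfg location for the given environment."""
    env = os.environ if env is None else env
    # Assumption: fall back to ~/airflow when AIRFLOW_HOME is not set.
    home = env.get("AIRFLOW_HOME") or os.path.join(os.path.expanduser("~"), "airflow")
    return Path(home) / "airflow.cfg"


def read_core_option(cfg_path, option, section="core"):
    """Read one option from airflow.cfg without interpolating '%%' escapes."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(cfg_path):
        raise FileNotFoundError(f"could not read {cfg_path}")
    return parser.get(section, option)


if __name__ == "__main__":
    path = airflow_cfg_path()
    print(path)
    if path.exists():
        print(read_core_option(path, "base_log_folder"))
```

Run as root on this machine, it should point at /root/airflow/airflow.cfg, which is why the home directory landed under ~root.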

 

 

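Here are a few commands that trigger a few task instances. As you run the commands below, you should be able to see the status of the jobs change in the example DAG (example_bash_operator in the commands below).

(See Quick Start.)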

・ run your first task instance

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow run example_bash_operator runme_0 2015-01-01
[2020-11-04 11:21:20,589] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 11:21:20,589] {dagbag.py:417} INFO - Filling up the DagBag from /root/airflow/dags
/root/.local/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
  category=PendingDeprecationWarning)
Running %s on host %s <TaskInstance: example_bash_operator.runme_0 2015-01-01T00:00:00+00:00 [None]> centos7copy
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/airflow", line 37, in <module>
    args.func(args)
  File "/root/.local/lib/python3.6/site-packages/airflow/utils/cli.py", line 76, in wrapper
    return f(*args, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/cli.py", line 579, in run
    _run(args, dag, ti)
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/cli.py", line 511, in _run
    executor.heartbeat()
  File "/root/.local/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 134, in heartbeat
    self.sync()
  File "/root/.local/lib/python3.6/site-packages/airflow/executors/sequential_executor.py", line 57, in sync
    subprocess.check_call(command, close_fds=True)
  File "/usr/lib64/python3.6/subprocess.py", line 306, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib64/python3.6/subprocess.py", line 287, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'airflow': 'airflow'

 

[centos7copy]$ export PATH="/root/.local/lib/python3.6/site-packages/airflow/bin:$PATH"
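Why the PATH change helps: the traceback ends in subprocess.check_call raising FileNotFoundError for 'airflow'. The SequentialExecutor starts a child process using the bare command name `airflow`, so that name has to resolve through PATH even when the parent was started with a full path. Below is a minimal sketch for checking this before running a task. The helper names are my own, not part of Airflow.

```python
import os
import shutil


def find_executable(name, path=None):
    """Return the full path that a bare command name resolves to, or None."""
    return shutil.which(name, path=path)


def prepend_to_path(directory, path=None):
    """Return a PATH string with `directory` placed first."""
    current = os.environ.get("PATH", "") if path is None else path
    return directory + os.pathsep + current if current else directory


if __name__ == "__main__":
    # None here means a child process started as 'airflow' would fail to launch.
    print(find_executable("airflow"))
```

If `find_executable("airflow")` prints None, the child process would presumably fail the same way as in the traceback above.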

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow run example_bash_operator runme_0 2015-01-01
[2020-11-04 11:25:24,386] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 11:25:24,387] {dagbag.py:417} INFO - Filling up the DagBag from /root/airflow/dags
/root/.local/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
  category=PendingDeprecationWarning)
Running %s on host %s <TaskInstance: example_bash_operator.runme_0 2015-01-01T00:00:00+00:00 [None]> centos7copy
[2020-11-04 11:25:31,403] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 11:25:31,404] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.runme_0 2015-01-01T00:00:00+00:00 [None]> centos7copy

 

[centos7copy]$ ll /root/airflow/
total 148
-rw-r--r--. 1 root root 38611 Nov  4 07:22 airflow.cfg
-rw-r--r--. 1 root root 92160 Nov  4 11:32 airflow.db
-rw-r--r--. 1 root root     7 Nov  4 08:12 airflow-webserver.pid
drwxr-xr-x. 6 root root  4096 Nov  4 11:20 logs
-rw-r--r--. 1 root root  2533 Nov  4 07:22 unittests.cfg

[centos7copy]$ ll /root/airflow/logs
total 12
drwxr-xr-x. 2 root root 4096 Nov  4 08:24 dag_processor_manager
drwxrwxrwx. 3 root root 4096 Nov  4 11:20 example_bash_operator
drwxr-xr-x. 4 root root 4096 Nov  4 09:00 scheduler

[centos7copy]$ ll /root/airflow/logs/dag_processor_manager
total 1960
-rw-r--r--. 1 root root 2001257 Nov  4 11:33 dag_processor_manager.log

[centos7copy]$ wc -l /root/airflow/logs/dag_processor_manager/dag_processor_manager.log
12976 /root/airflow/logs/dag_processor_manager/dag_processor_manager.log

[centos7copy]$ tail /root/airflow/logs/dag_processor_manager/dag_processor_manager.log
/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_external_task_marker_dag.py                              0           0  6.47s           2020-11-04T02:35:19
/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_trigger_target_dag.py                                        0           0  6.46s           2020-11-04T02:37:29
/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_skip_dag.py                                                  0           0  6.46s           2020-11-04T02:37:22
/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_pig_operator.py                                              0           0  6.47s           2020-11-04T02:36:05
/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_python_operator.py                                         0           0  6.47s           2020-11-04T02:36:56
/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_branch_python_dop_operator_3.py                        0           0  6.46s           2020-11-04T02:37:03
/root/.local/lib/python3.6/site-packages/airflow/example_dags/subdags/subdag.py                                                    0           0  6.46s           2020-11-04T02:36:11
================================================================================
[2020-11-04 11:37:35,811] {dag_processing.py:1312} INFO - Finding 'running' jobs without a recent heartbeat
[2020-11-04 11:37:35,811] {dag_processing.py:1316} INFO - Failing jobs without heartbeat after 2020-11-04 02:32:35.811714+00:00

 

[centos7copy]$ ll /root/airflow/logs/example_bash_operator
total 4
drwxrwxrwx. 3 root root 4096 Nov  4 11:20 runme_0

[centos7copy]$ ll /root/airflow/logs/example_bash_operator/runme_0/
total 4
drwxrwxrwx. 2 root root 4096 Nov  4 11:20 2015-01-01T00:00:00+00:00

[centos7copy]$ ll /root/airflow/logs/example_bash_operator/runme_0/2015-01-01T00\:00\:00+00\:00/
total 8
-rw-rw-rw-. 1 root root 5039 Nov  4 11:26 1.log

[centos7copy]$ wc -l /root/airflow/logs/example_bash_operator/runme_0/2015-01-01T00\:00\:00+00\:00/1.log
33 /root/airflow/logs/example_bash_operator/runme_0/2015-01-01T00:00:00+00:00/1.log

[centos7copy]$ cat /root/airflow/logs/example_bash_operator/runme_0/2015-01-01T00\:00\:00+00\:00/1.log
[2020-11-04 11:20:26,219] {logging_mixin.py:112} INFO - Sending to executor.
[2020-11-04 11:20:26,220] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:20:26,221] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:21:11,199] {logging_mixin.py:112} INFO - Sending to executor.
[2020-11-04 11:21:11,200] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:21:11,201] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:21:26,990] {logging_mixin.py:112} INFO - Sending to executor.
[2020-11-04 11:21:26,990] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:21:26,991] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:25:29,180] {logging_mixin.py:112} INFO - Sending to executor.
[2020-11-04 11:25:29,181] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:25:29,182] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py']
[2020-11-04 11:25:47,482] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: example_bash_operator.runme_0 2015-01-01T00:00:00+00:00 [None]>
[2020-11-04 11:25:47,489] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: example_bash_operator.runme_0 2015-01-01T00:00:00+00:00 [None]>
[2020-11-04 11:25:47,489] {taskinstance.py:880} INFO -
--------------------------------------------------------------------------------
[2020-11-04 11:25:47,489] {taskinstance.py:881} INFO - Starting attempt 1 of 1
[2020-11-04 11:25:47,489] {taskinstance.py:882} INFO -
--------------------------------------------------------------------------------
[2020-11-04 11:25:47,497] {taskinstance.py:901} INFO - Executing <Task(BashOperator): runme_0> on 2015-01-01T00:00:00+00:00
[2020-11-04 11:25:47,499] {standard_task_runner.py:54} INFO - Started process 119493 to run task
[2020-11-04 11:25:47,518] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-01T00:00:00+00:00', '--job_id', '3', '--pool', 'default_pool', '--raw', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpotct5i3g']
[2020-11-04 11:25:47,519] {standard_task_runner.py:78} INFO - Job 3: Subtask runme_0
[2020-11-04 11:25:53,583] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: example_bash_operator.runme_0 2015-01-01T00:00:00+00:00 [running]> centos7copy
[2020-11-04 11:25:59,684] {bash_operator.py:113} INFO - Tmp dir root location:
 /tmp
[2020-11-04 11:25:59,685] {bash_operator.py:136} INFO - Temporary script location: /tmp/airflowtmpkpo23qoz/runme_0_3w_kkxh
[2020-11-04 11:25:59,685] {bash_operator.py:146} INFO - Running command: echo "example_bash_operator__runme_0__20150101" && sleep 1
[2020-11-04 11:25:59,690] {bash_operator.py:153} INFO - Output:
[2020-11-04 11:25:59,692] {bash_operator.py:157} INFO - example_bash_operator__runme_0__20150101
[2020-11-04 11:26:00,695] {bash_operator.py:161} INFO - Command exited with return code 0
[2020-11-04 11:26:00,704] {taskinstance.py:1070} INFO - Marking task as SUCCESS.dag_id=example_bash_operator, task_id=runme_0, execution_date=20150101T000000, start_date=20201104T022547, end_date=20201104T022600
[2020-11-04 11:26:05,759] {local_task_job.py:102} INFO - Task exited with return code 0

 

[centos7copy]$ ls -lR /root/airflow/logs/scheduler/
/root/airflow/logs/scheduler/:
total 8
drwxr-xr-x. 2 root root 4096 Nov  4 07:22 2020-11-03
drwxr-xr-x. 2 root root 4096 Nov  4 09:00 2020-11-04
lrwxrwxrwx. 1 root root   39 Nov  4 09:00 latest -> /root/airflow/logs/scheduler/2020-11-04

/root/airflow/logs/scheduler/2020-11-03:
total 0

/root/airflow/logs/scheduler/2020-11-04:
total 0

 

[centos7copy]$ file /root/airflow/airflow.db
/root/airflow/airflow.db: SQLite 3.x database
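Since airflow.db is a plain SQLite file, it can be inspected with Python's built-in sqlite3 module. This is only a sketch: the function names are my own, and `count_task_states` assumes the metadata database has a `task_instance` table with a `state` column. It is probably safer to run this against a copy of the file than against the live database.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the table names in a SQLite database file, sorted by name."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def count_task_states(db_path):
    """Count task instances per state.

    Assumption: a task_instance table with a state column exists.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT state, COUNT(*) FROM task_instance GROUP BY state"
        ).fetchall()
    return dict(rows)
```

For example, `count_task_states("/root/airflow/airflow.db")` would return a dict mapping each state to its number of task instances.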

 

[centos7copy]$ wc -l /root/airflow/airflow.cfg
1073 /root/airflow/airflow.cfg

 

・Check the logging settings

[centos7copy]$ cat -n /root/airflow/airflow.cfg

      1 [core]

----(snip)----

      6 # The folder where airflow should store its log files
      7 # This path must be absolute
      8 base_log_folder = /root/airflow/logs

      9
     10 # Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elastic Search.
     11 # Set this to True if you want to enable remote logging.
     12 remote_logging = False
     13
     14 # Users must supply an Airflow connection id that provides access to the storage
     15 # location.
     16 remote_log_conn_id =
     17 remote_base_log_folder =
     18 encrypt_s3_logs = False
     19
     20 # Logging level
     21 logging_level = INFO
     22
     23 # Logging level for Flask-appbuilder UI
     24 fab_logging_level = WARN
     25
     26 # Logging class
     27 # Specify the class that will specify the logging configuration
     28 # This class has to be on the python classpath
     29 # Example: logging_config_class = my.path.default_local_settings.LOGGING_CONFIG
     30 logging_config_class =

     31
     32 # Flag to enable/disable Colored logs in Console
     33 # Colour the logs when the controlling terminal is a TTY.
     34 colored_console_log = True

     35
     36 # Log format for when Colored logs is enabled
     37 colored_log_format = [%%(blue)s%%(asctime)s%%(reset)s] {%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
     38 colored_formatter_class = airflow.utils.log.colored_log.CustomTTYColoredFormatter
     39
     40 # Format of Log line
     41 log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s
     42 simple_log_format = %%(asctime)s %%(levelname)s - %%(message)s
     43
     44 # Log filename format
     45 log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
     46 log_processor_filename_template = {{ filename }}.log
     47 dag_processor_manager_log_location = /root/airflow/logs/dag_processor_manager/dag_processor_manager.log
     48
     49 # Name of handler to read task instance logs.
     50 # Default to use task handler.
     51 task_log_reader = task

----(snip)----

    344 # Log files for the gunicorn webserver. '-' means log to stderr.
    345 access_logfile = -
    346
    347 # Log files for the gunicorn webserver. '-' means log to stderr.
    348 error_logfile = -

----(snip)----

    385 # The amount of time (in secs) webserver will wait for initial handshake
    386 # while fetching logs from other worker machine
    387 log_fetch_timeout_sec = 5
    388
    389 # Time interval (in secs) to wait before next log fetching.
    390 log_fetch_delay_sec = 2
    391
    392 # Distance away from page bottom to enable auto tailing.
    393 log_auto_tailing_offset = 30
    394
    395 # Animation speed for auto tailing log display.
    396 log_animation_speed = 1000

----(snip)----

    439 # Default setting for wrap toggle on DAG code and TI log views.
    440 default_wrap = False

----(snip)----

    456 # Minutes of non-activity before logged out from UI
    457 # 0 means never get forcibly logged out
    458 force_log_out_after = 0

----(snip)----

    508 # When you start an airflow worker, airflow starts a tiny web server
    509 # subprocess to serve the workers local log files to the airflow main
    510 # web server, who then builds pages and sends them to users. This defines
    511 # the port on which the logs are served. It needs to be unused, and open
    512 # visible from the main web server to connect into the workers.
    513 worker_log_server_port = 8793

----(snip)----

    627 # How often should stats be printed to the logs. Setting to 0 will disable printing stats
    628 print_stats_interval = 30

    629
    630 # If the last scheduler heartbeat happened more than scheduler_health_check_threshold
    631 # ago (in seconds), scheduler is considered unhealthy.
    632 # This is used by the health check in the "/health" endpoint
    633 scheduler_health_check_threshold = 30
    634 child_process_log_directory = /root/airflow/logs/scheduler

----(snip)----

    767 # Format of the log_id, which is used to query for a given tasks logs
    768 log_id_template = {dag_id}-{task_id}-{execution_date}-{try_number}

    769
    770 # Used to mark the end of a log stream for a task
    771 end_of_log_mark = end_of_log

    772
    773 # Qualified URL for an elasticsearch frontend (like Kibana) with a template argument for log_id
    774 # Code will construct log_id using the log_id template from the argument above.
    775 # NOTE: The code will prefix the https:// automatically, don't include that here.
    776 frontend =

    777
    778 # Write the task logs to the stdout of the worker, rather than the default files
    779 write_stdout = False

    780
    781 # Instead of the default log formatter, write the log lines as JSON
    782 json_format = False

----(snip)----

    860 # For volume mounted logs, the worker will look in this subpath for logs
    861 logs_volume_subpath =
    862
    863 # A shared volume claim for the logs
    864 logs_volume_claim =

----(snip)----

    870 # A hostPath volume for the logs
    871 # Useful in local environment, discouraged in production
    872 logs_volume_host =
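The log_filename_template on line 45 of the listing explains the directory layout seen earlier: the dag_id, then the task_id, then the timestamp, then the try number. The sketch below rebuilds that path with plain string joining. It is a stand-in for illustration and does not use the Jinja rendering that the template itself implies. The function name is hypothetical.

```python
import os


def task_log_path(base_log_folder, dag_id, task_id, ts, try_number):
    """Build <base>/<dag_id>/<task_id>/<ts>/<try_number>.log.

    Mirrors the layout of log_filename_template with simple joining,
    not with Jinja.
    """
    return os.path.join(base_log_folder, dag_id, task_id, ts, f"{try_number}.log")


if __name__ == "__main__":
    print(
        task_log_path(
            "/root/airflow/logs",
            "example_bash_operator",
            "runme_0",
            "2015-01-01T00:00:00+00:00",
            1,
        )
    )
```

With the values from the first task run, this yields the same 1.log path that was inspected with wc and cat above.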

・run a backfill over 2 days

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow backfill example_bash_operator -s 2015-01-01 -e 2015-01-02

[2020-11-04 12:11:16,997] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:11:16,998] {dagbag.py:417} INFO - Filling up the DagBag from /root/airflow/dags
/root/.local/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
  category=PendingDeprecationWarning)
[2020-11-04 12:11:23,337] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpn52f1n_f']
[2020-11-04 12:11:23,351] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_1', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmp5k2oixew']
[2020-11-04 12:11:23,366] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_1', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpc4xyk0wj']
[2020-11-04 12:11:23,381] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_2', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpvepb8m5d']
[2020-11-04 12:11:23,396] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'runme_2', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpzopwz3b6']
[2020-11-04 12:11:23,412] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'also_run_this', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpep8bmzvt']
[2020-11-04 12:11:23,427] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'also_run_this', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpa9ioy2s1']
[2020-11-04 12:11:28,218] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_0', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpn52f1n_f']
[2020-11-04 12:11:29,256] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:11:29,256] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.runme_0 2015-01-02T00:00:00+00:00 [queued]> centos7copy
[2020-11-04 12:12:03,547] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_1', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmp5k2oixew']
[2020-11-04 12:12:04,467] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:12:04,468] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.runme_1 2015-01-01T00:00:00+00:00 [queued]> centos7copy
[2020-11-04 12:12:40,450] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_1', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpc4xyk0wj']
[2020-11-04 12:12:41,366] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:12:41,367] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.runme_1 2015-01-02T00:00:00+00:00 [queued]> centos7copy

[2020-11-04 12:13:17,262] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_2', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpvepb8m5d']
[2020-11-04 12:13:18,211] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:13:18,211] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.runme_2 2015-01-01T00:00:00+00:00 [queued]> centos7copy

[2020-11-04 12:13:53,825] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'runme_2', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpzopwz3b6']
[2020-11-04 12:13:54,740] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:13:54,741] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.runme_2 2015-01-02T00:00:00+00:00 [queued]> centos7copy

[2020-11-04 12:14:30,501] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'also_run_this', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpep8bmzvt']
[2020-11-04 12:14:31,464] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:14:31,464] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.also_run_this 2015-01-01T00:00:00+00:00 [queued]> centos7copy
[2020-11-04 12:15:07,109] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'also_run_this', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpa9ioy2s1']
[2020-11-04 12:15:08,014] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:15:08,015] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.also_run_this 2015-01-02T00:00:00+00:00 [queued]> centos7copy

[2020-11-04 12:15:43,910] {backfill_job.py:364} INFO - [backfill progress] | finished run 0 of 2 | tasks waiting: 4 | succeeded: 8 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 4
[2020-11-04 12:15:43,924] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'run_after_loop', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmp9_8g1fjk']
[2020-11-04 12:15:43,943] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'run_after_loop', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpd_m5815p']
[2020-11-04 12:15:43,980] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'run_after_loop', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmp9_8g1fjk']
[2020-11-04 12:15:44,897] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:15:44,898] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.run_after_loop 2015-01-01T00:00:00+00:00 [queued]> centos7copy
[2020-11-04 12:16:20,398] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'run_after_loop', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpd_m5815p']
[2020-11-04 12:16:21,302] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:16:21,303] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.run_after_loop 2015-01-02T00:00:00+00:00 [queued]> centos7copy
[2020-11-04 12:16:57,384] {backfill_job.py:364} INFO - [backfill progress] | finished run 0 of 2 | tasks waiting: 2 | succeeded: 10 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-11-04 12:16:57,397] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'run_this_last', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpmu_bn0q5']
[2020-11-04 12:16:57,415] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'example_bash_operator', 'run_this_last', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmplauu48v2']
[2020-11-04 12:16:57,427] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'run_this_last', '2015-01-01T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmpmu_bn0q5']
[2020-11-04 12:16:58,329] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:16:58,329] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.run_this_last 2015-01-01T00:00:00+00:00 [queued]> centos7copy

[2020-11-04 12:17:33,991] {sequential_executor.py:54} INFO - Executing command: ['airflow', 'run', 'example_bash_operator', 'run_this_last', '2015-01-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py', '--cfg_path', '/tmp/tmplauu48v2']
[2020-11-04 12:17:34,952] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 12:17:34,953] {dagbag.py:417} INFO - Filling up the DagBag from /root/.local/lib/python3.6/site-packages/airflow/example_dags/example_bash_operator.py
Running %s on host %s <TaskInstance: example_bash_operator.run_this_last 2015-01-02T00:00:00+00:00 [queued]> centos7copy
[2020-11-04 12:18:10,572] {dagrun.py:320} INFO - Marking run <DagRun example_bash_operator @ 2015-01-01T00:00:00+00:00: backfill_2015-01-01T00:00:00+00:00, externally triggered: False> successful
[2020-11-04 12:18:10,582] {dagrun.py:320} INFO - Marking run <DagRun example_bash_operator @ 2015-01-02 00:00:00+00:00: backfill_2015-01-02T00:00:00+00:00, externally triggered: False> successful
[2020-11-04 12:18:10,588] {backfill_job.py:364} INFO - [backfill progress] | finished run 2 of 2 | tasks waiting: 0 | succeeded: 12 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-11-04 12:18:10,589] {backfill_job.py:813} INFO - Backfill done. Exiting.
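The [backfill progress] lines above are the quickest way to follow a backfill. The DAG has 6 tasks and the range covers 2 days, which matches the final "succeeded: 12". Here is a small sketch that pulls the counters out of such a line with a regular expression. The function name and the dictionary keys are my own choices, and the pattern is based only on the lines shown in this log.

```python
import re


def parse_backfill_progress(line):
    """Parse a '[backfill progress]' log line into a dict of integer counters.

    Returns None when the line is not a backfill progress line.
    """
    marker = "[backfill progress]"
    if marker not in line:
        return None
    tail = line.split(marker, 1)[1]
    fields = {}
    runs = re.search(r"finished run (\d+) of (\d+)", tail)
    if runs:
        fields["finished_runs"] = int(runs.group(1))
        fields["total_runs"] = int(runs.group(2))
    # Each remaining counter looks like '| name: number'.
    for key, value in re.findall(r"\|\s*([a-z ]+):\s*(\d+)", tail):
        fields[key.strip().replace(" ", "_")] = int(value)
    return fields
```

Feeding it the last progress line gives finished_runs and total_runs of 2 and a succeeded count of 12, so a script could wait until `not_ready` and `tasks_waiting` both reach 0.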

[centos7copy]$ ll /root/airflow/
total 176
-rw-r--r--. 1 root root  38611 Nov  4 07:22 airflow.cfg
-rw-r--r--. 1 root root 122880 Nov  4 12:19 airflow.db
-rw-r--r--. 1 root root      7 Nov  4 08:12 airflow-webserver.pid
drwxr-xr-x. 6 root root   4096 Nov  4 11:20 logs
-rw-r--r--. 1 root root   2533 Nov  4 07:22 unittests.cfg

[centos7copy]$ ll /root/airflow/logs/
total 12
drwxr-xr-x. 2 root root 4096 Nov  4 08:24 dag_processor_manager
drwxrwxrwx. 8 root root 4096 Nov  4 12:16 example_bash_operator
drwxr-xr-x. 4 root root 4096 Nov  4 09:00 scheduler

 

■Accessing the web server

・Click the "example_bash_operator" DAG

 

 

 

■Official site

https://airflow.apache.org

■Official documentation

Apache Airflow Documentation

Quick Start

Installation

Tutorial

How-to Guides

Concepts

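■Overview

・What is Airflow? (② above)

Airflow is a platform to programmatically author, schedule, and monitor workflows.

When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative.

・Features (② above)

- Dynamic: Airflow pipelines are configured as code (Python), which allows dynamic pipeline generation. This lets you write code that instantiates pipelines dynamically.

- Extensible: You can easily define your own operators and executors and extend the library so that it fits the level of abstraction that suits your environment.

- Elegant: Airflow pipelines are lean and explicit. Parameterizing your scripts is built into the core of Airflow using the powerful Jinja templating engine.

- Scalable: Airflow has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers. Airflow is ready to scale to infinity.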

 

 

■Installation prerequisites

[centos7copy]$ yum install -y python-setproctitle

 

■Installation

Quick Start

Installation

 
Just do exactly what ③ above says.

[centos7copy]$ export AIRFLOW_HOME=~/airflow
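For reference, here is a small Python sketch of why the home directory ends up where it does. It assumes that Airflow uses AIRFLOW_HOME when it is set and falls back to ~/airflow otherwise, which is how the Quick Start describes it. The function name `resolve_airflow_home` is made up for this illustration and is not part of Airflow.

```python
import os


def resolve_airflow_home(environ=None):
    """Return the directory Airflow would presumably use as its home.

    Assumption (not taken from Airflow's source): AIRFLOW_HOME wins if it
    is set, otherwise "~/airflow" is used and "~" is expanded for the
    current user.
    """
    environ = os.environ if environ is None else environ
    home = environ.get("AIRFLOW_HOME", "~/airflow")
    return os.path.expanduser(home)


if __name__ == "__main__":
    # Shows where the config file and SQLite DB would be created.
    print(resolve_airflow_home())
```

When this runs as root, "~" expands to /root, which would explain why airflow.cfg and airflow.db were created under /root/airflow.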

Not sure whether it is needed, but I'll create an airflow user as well.

[centos7copy]$ useradd airflow

[centos7copy]$ pip3.6 install apache-airflow

----(omitted)----

A whole lot of packages got downloaded.

----(omitted)----

    compilation terminated.
    error: command 'gcc' failed with exit status 1

Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-tl9zl8rd/setproctitle/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-9b3zohi3-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in 

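----(omitted)----

It seems the gcc I recently downloaded from the official GNU site and installed is the problem.

Change PATH so that the gcc shipped in the official CentOS 7 repository is used instead.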

[centos7copy]$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/oracle/product/18c/dbhome_1/bin:/root/bin

The gcc from the GNU site lives in /usr/local/bin:

[centos7copy]$ ll /usr/local/bin/gcc
-rwxr-xr-x. 3 root root 2859648 Oct 31 00:32 /usr/local/bin/gcc

Temporarily reset the PATH environment variable to the default

[centos7copy]$ export PATH="/usr/sbin:/usr/bin"

Run it again

[centos7copy]$ pip3.6 install apache-airflow

→Same error. Why? The build seems to be failing in gcc, but maybe python-devel is required?

(because the Python 3 header files are needed)

[centos7copy]$ yum install python3-devel
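Before rerunning pip, a quick way to confirm that the headers are actually present is to look for Python.h in the include directory that the interpreter itself reports. This is only a sketch; the helper names are my own, and it has to be run with the same python3 that pip3.6 uses.

```python
import os
import sysconfig


def python_header_path():
    """Return the path where Python.h is expected for this interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_headers():
    """Return True if Python.h exists, i.e. C extensions can find it."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    status = "found" if has_python_headers() else "MISSING"
    print(python_header_path(), status)
```

If this reports MISSING, source packages with C extensions such as setproctitle and psutil will most likely fail in gcc no matter which gcc is first on PATH.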

Run it again

[centos7copy]$ pip3.6 install apache-airflow

Collecting apache-airflow
  Using cached https://files.pythonhosted.org/packages/36/07/368cf47f06564d7ffff603ade4c60039ecf3f5b368b75201f4ccb5512d78/apache_airflow-1.10.12-py2.py3-none-any.whl
Requirement already satisfied: thrift>=0.9.2 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: croniter<0.4,>=0.3.17 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: tenacity==4.12.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: jinja2<2.12.0,>=2.10.1 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: configparser<3.6.0,>=3.5.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: pygments<3.0,>=2.0.1 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: jsonschema~=3.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: python-nvd3~=0.15.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: cattrs~=1.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: flask<2.0,>=1.1.0 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: werkzeug<1.0.0 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: gunicorn<21.0,>=19.5.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: typing-extensions>=3.7.4; python_version < "3.8" in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Collecting psutil<6.0.0,>=4.2.0 (from apache-airflow)
  Using cached https://files.pythonhosted.org/packages/33/e0/82d459af36bda999f82c7ea86c67610591cf5556168f48fd6509e5fa154d/psutil-5.7.3.tar.gz
Requirement already satisfied: python-slugify<5.0,>=3.0.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Collecting pendulum==1.4.4 (from apache-airflow)
  Using cached https://files.pythonhosted.org/packages/30/47/02f04abed54918d2a3f1da602a8254247670b2e1a99b4b1f02734a27e71e/pendulum-1.4.4-cp36-cp36m-manylinux1_x86_64.whl
Requirement already satisfied: email-validator in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Collecting pandas<2.0,>=0.17.1 (from apache-airflow)
  Using cached https://files.pythonhosted.org/packages/4d/51/bafcff417cd857bc6684336320863b5e5af280530213ef8f534b6042cfe6/pandas-1.1.4-cp36-cp36m-manylinux1_x86_64.whl
Collecting setproctitle<2,>=1.1.8 (from apache-airflow)
  Using cached https://files.pythonhosted.org/packages/5a/0d/dc0d2234aacba6cf1a729964383e3452c52096dc695581248b548786f2b3/setproctitle-1.1.10.tar.gz
Requirement already satisfied: flask-login<0.5,>=0.3 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: tzlocal<2.0.0,>=1.4 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: zope.deprecation<5.0,>=4.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: flask-caching<1.4.0,>=1.3.3 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: argcomplete~=1.10 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: colorlog==4.0.2 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: json-merge-patch==0.2 in ./.local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: graphviz>=0.12 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: python-daemon>=2.1.1 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: python-dateutil<3,>=2.3 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: tabulate<0.9,>=0.7.5 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: flask-wtf<0.15,>=0.14.2 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: requests<3,>=2.20.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: unicodecsv>=0.14.1 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: flask-swagger<0.3,>=0.2.13 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: cached-property~=1.5 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: future<0.19,>=0.16.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: alembic<2.0,>=1.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: funcsigs<2.0.0,>=1.0.0 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: flask-admin==1.5.4 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: dill<0.4,>=0.2.2 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: iso8601>=0.1.12 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: sqlalchemy-jsonfield~=0.9; python_version >= "3.5" in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: markdown<3.0,>=2.5.2 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: lazy-object-proxy~=1.3 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: sqlalchemy~=1.3 in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: flask-appbuilder~=2.2; python_version >= "3.6" in /usr/local/lib64/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: attrs~=19.3 in /usr/local/lib/python3.6/site-packages (from apache-airflow)
Requirement already satisfied: six>=1.7.2 in /usr/local/lib/python3.6/site-packages (from thrift>=0.9.2->apache-airflow)
Requirement already satisfied: natsort in /usr/local/lib/python3.6/site-packages (from croniter<0.4,>=0.3.17->apache-airflow)
Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib64/python3.6/site-packages (from jinja2<2.12.0,>=2.10.1->apache-airflow)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in /usr/local/lib/python3.6/site-packages (from jsonschema~=3.0->apache-airflow)
Requirement already satisfied: setuptools in /usr/lib/python3.6/site-packages (from jsonschema~=3.0->apache-airflow)
Requirement already satisfied: pyrsistent>=0.14.0 in /usr/local/lib64/python3.6/site-packages (from jsonschema~=3.0->apache-airflow)
Requirement already satisfied: click>=5.1 in /usr/local/lib/python3.6/site-packages (from flask<2.0,>=1.1.0->apache-airflow)
Requirement already satisfied: itsdangerous>=0.24 in /usr/local/lib/python3.6/site-packages (from flask<2.0,>=1.1.0->apache-airflow)
Requirement already satisfied: text-unidecode>=1.3 in /usr/local/lib/python3.6/site-packages (from python-slugify<5.0,>=3.0.0->apache-airflow)
Collecting pytzdata>=2018.3.0.0 (from pendulum==1.4.4->apache-airflow)
  Using cached https://files.pythonhosted.org/packages/e0/4f/4474bda990ee740a020cbc3eb271925ef7daa7c8444240d34ff62c8442a3/pytzdata-2020.1-py2.py3-none-any.whl
Requirement already satisfied: dnspython>=1.15.0 in /usr/local/lib/python3.6/site-packages (from email-validator->apache-airflow)
Requirement already satisfied: idna>=2.0.0 in /usr/local/lib/python3.6/site-packages (from email-validator->apache-airflow)
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.6/site-packages (from pandas<2.0,>=0.17.1->apache-airflow)
Collecting numpy>=1.15.4 (from pandas<2.0,>=0.17.1->apache-airflow)
  Using cached https://files.pythonhosted.org/packages/a6/fc/36e52d0ae2aa502b211f1bcd2fdeec72d343d58224eabcdddc1bcb052db1/numpy-1.19.4-cp36-cp36m-manylinux1_x86_64.whl
Requirement already satisfied: docutils in /usr/local/lib/python3.6/site-packages (from python-daemon>=2.1.1->apache-airflow)
Requirement already satisfied: lockfile>=0.10 in /usr/local/lib/python3.6/site-packages (from python-daemon>=2.1.1->apache-airflow)
Requirement already satisfied: WTForms in /usr/local/lib64/python3.6/site-packages (from flask-wtf<0.15,>=0.14.2->apache-airflow)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/site-packages (from requests<3,>=2.20.0->apache-airflow)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.6/site-packages (from requests<3,>=2.20.0->apache-airflow)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.6/site-packages (from requests<3,>=2.20.0->apache-airflow)
Requirement already satisfied: PyYAML>=5.1 in /usr/local/lib64/python3.6/site-packages (from flask-swagger<0.3,>=0.2.13->apache-airflow)
Requirement already satisfied: Mako in /usr/local/lib/python3.6/site-packages (from alembic<2.0,>=1.0->apache-airflow)
Requirement already satisfied: python-editor>=0.3 in /usr/local/lib/python3.6/site-packages (from alembic<2.0,>=1.0->apache-airflow)
Requirement already satisfied: typing>=3.6; python_version < "3.7" in /usr/local/lib/python3.6/site-packages (from sqlalchemy-jsonfield~=0.9; python_version >= "3.5"->apache-airflow)
Requirement already satisfied: Flask-OpenID<2,>=1.2.5 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: marshmallow<3.0.0,>=2.18.0 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: marshmallow-sqlalchemy<1,>=0.16.1 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: prison<1.0.0,>=0.1.3 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: Flask-Babel<2,>=1 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: sqlalchemy-utils<1,>=0.32.21 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: apispec[yaml]<2,>=1.1.1 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: Flask-JWT-Extended<4,>=3.18 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: marshmallow-enum<2,>=1.4.1 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: PyJWT>=1.7.1 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: colorama<1,>=0.3.9 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: Flask-SQLAlchemy<3,>=2.4 in /usr/local/lib/python3.6/site-packages (from flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.6/site-packages (from importlib-metadata; python_version < "3.8"->jsonschema~=3.0->apache-airflow)
Requirement already satisfied: python3-openid>=2.0 in /usr/local/lib/python3.6/site-packages (from Flask-OpenID<2,>=1.2.5->flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: Babel>=2.3 in /usr/local/lib/python3.6/site-packages (from Flask-Babel<2,>=1->flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Requirement already satisfied: defusedxml in /usr/local/lib/python3.6/site-packages (from python3-openid>=2.0->Flask-OpenID<2,>=1.2.5->flask-appbuilder~=2.2; python_version >= "3.6"->apache-airflow)
Installing collected packages: psutil, pytzdata, pendulum, numpy, pandas, setproctitle, apache-airflow
  Running setup.py install for psutil ... done
  Running setup.py install for setproctitle ... done
Successfully installed apache-airflow-1.10.12 numpy-1.19.4 pandas-1.1.4 pendulum-1.4.4 psutil-5.7.3 pytzdata-2020.1 setproctitle-1.1.10

→Looks like it succeeded

[centos7copy]$ pip3.6 list|grep airflow
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
apache-airflow (1.10.12)

[centos7copy]$ pip3.6 show apache-airflow
Name: apache-airflow
Version: 1.10.12
Summary: Programmatically author, schedule and monitor data pipelines
Home-page: http://airflow.apache.org/
Author: Apache Software Foundation
Author-email: dev@airflow.apache.org
License: Apache License 2.0

Location: /root/.local/lib/python3.6/site-packages  ← install location
Requires: lazy-object-proxy, unicodecsv, requests, alembic, funcsigs, flask-appbuilder, zope.deprecation, python-slugify, python-dateutil, markdown, python-nvd3, pygments, cattrs, tenacity, flask-wtf, python-daemon, cached-property, werkzeug, tabulate, configparser, sqlalchemy, attrs, argcomplete, tzlocal, typing-extensions, psutil, pendulum, flask-caching, flask-swagger, setproctitle, gunicorn, flask-login, graphviz, future, sqlalchemy-jsonfield, jinja2, flask, thrift, flask-admin, pandas, email-validator, dill, json-merge-patch, colorlog, jsonschema, iso8601, croniter

[centos7copy]$ ll /root/.local/lib/python3.6/site-packages
total 128
drwxr-xr-x. 27 root root  4096 Nov  4 06:11 airflow
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 apache_airflow-1.10.12.dist-info
drwxr-xr-x.  3 root root  4096 Nov  4 06:08 json_merge_patch
drwxr-xr-x.  2 root root  4096 Nov  4 06:08 json_merge_patch-0.2-py3.6.egg-info
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 kubernetes_tests
drwxr-xr-x. 17 root root  4096 Nov  4 06:11 numpy
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 numpy-1.19.4.dist-info
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 numpy.libs
drwxr-xr-x. 15 root root  4096 Nov  4 06:11 pandas
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 pandas-1.1.4.dist-info
drwxr-xr-x.  9 root root  4096 Nov  4 06:11 pendulum
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 pendulum-1.4.4.dist-info
drwxr-xr-x.  4 root root  4096 Nov  4 06:11 psutil
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 psutil-5.7.3-py3.6.egg-info
drwxr-xr-x.  5 root root  4096 Nov  4 06:11 pytzdata
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 pytzdata-2020.1.dist-info
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 setproctitle-1.1.10-py3.6.egg-info
-rwxr-xr-x.  1 root root 58376 Nov  4 06:11 setproctitle.cpython-36m-x86_64-linux-gnu.so

[centos7copy]$ ll /root/.local/lib/python3.6/site-packages/airflow/
total 200
-rw-r--r--.  1 root root  2237 Nov  4 06:11 alembic.ini
drwxr-xr-x.  6 root root  4096 Nov  4 06:11 api
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 bin
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 config_templates
-rw-r--r--.  1 root root 31180 Nov  4 06:11 configuration.py
drwxr-xr-x. 12 root root  4096 Nov  4 06:11 contrib
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 dag
-rw-r--r--.  1 root root  2646 Nov  4 06:11 default_login.py
drwxr-xr-x.  4 root root  4096 Nov  4 06:11 example_dags
-rw-r--r--.  1 root root  5236 Nov  4 06:11 exceptions.py
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 executors
-rw-r--r--.  1 root root    57 Nov  4 06:11 git_version
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 hooks
-rw-r--r--.  1 root root  3470 Nov  4 06:11 __init__.py
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 jobs
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 kubernetes
drwxr-xr-x.  4 root root  4096 Nov  4 06:11 lineage
-rw-r--r--.  1 root root  3833 Nov  4 06:11 logging_config.py
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 macros
drwxr-xr-x.  4 root root  4096 Nov  4 06:11 migrations
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 models
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 operators
-rw-r--r--.  1 root root  8165 Nov  4 06:11 plugins_manager.py
drwxr-xr-x.  2 root root  4096 Nov  4 06:11 __pycache__
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 secrets
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 security
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 sensors
-rw-r--r--.  1 root root  5410 Nov  4 06:11 sentry.py
drwxr-xr-x.  3 root root  4096 Nov  4 06:11 serialization
-rw-r--r--.  1 root root 15464 Nov  4 06:11 settings.py
drwxr-xr-x.  4 root root  4096 Nov  4 06:11 task
drwxr-xr-x.  4 root root  4096 Nov  4 06:11 ti_deps
-rw-r--r--.  1 root root  1207 Nov  4 06:11 typing_compat.py
drwxr-xr-x.  4 root root  4096 Nov  4 06:11 utils
-rw-r--r--.  1 root root   834 Nov  4 06:11 version.py
drwxr-xr-x.  6 root root  4096 Nov  4 06:11 www
drwxr-xr-x.  6 root root  4096 Nov  4 06:11 www_rbac

[centos7copy]$ ll /root/.local/lib/python3.6/site-packages/airflow/bin
total 120
-rwxr-xr-x. 1 root root   1305 Nov  4 06:11 airflow    ← here it is
-rw-r--r--. 1 root root 108769 Nov  4 06:11 cli.py
-rw-r--r--. 1 root root    811 Nov  4 06:11 __init__.py
drwxr-xr-x. 2 root root   4096 Nov  4 06:11 __pycache__

[centos7copy]$ find /root/.local/lib/python3.6/site-packages/ -type d -name bin
/root/.local/lib/python3.6/site-packages/airflow/bin

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow initdb
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/airflow", line 23, in <module>
    import argcomplete
ImportError: No module named argcomplete

[centos7copy]$ file /root/.local/lib/python3.6/site-packages/airflow/bin/airflow
/root/.local/lib/python3.6/site-packages/airflow/bin/airflow: Python script, ASCII text executable

[centos7copy]$ pip3.6 freeze | grep argcomplete
argcomplete==1.12.1

[centos7copy]$ pip3.6 show argcomplete
Name: argcomplete
Version: 1.12.1
Summary: Bash tab completion for argparse
Home-page: https://github.com/kislyuk/argcomplete
Author: Andrey Kislyuk
Author-email: kislyuk@gmail.com
License: Apache Software License
Location: /usr/local/lib/python3.6/site-packages
Requires: importlib-metadata

[centos7copy]$ cat -n /root/.local/lib/python3.6/site-packages/airflow/bin/airflow
     1  #!/usr/bin/env python  ← this?
     2  # PYTHON_ARGCOMPLETE_OK
     3  # -*- coding: utf-8 -*-
     4  #
     5  # Licensed to the Apache Software Foundation (ASF) under one
     6  # or more contributor license agreements.  See the NOTICE file
     7  # distributed with this work for additional information
     8  # regarding copyright ownership.  The ASF licenses this file
     9  # to you under the Apache License, Version 2.0 (the
    10  # "License"); you may not use this file except in compliance
    11  # with the License.  You may obtain a copy of the License at
    12  #
    13  #   http://www.apache.org/licenses/LICENSE-2.0
    14  #
    15  # Unless required by applicable law or agreed to in writing,
    16  # software distributed under the License is distributed on an
    17  # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    18  # KIND, either express or implied.  See the License for the
    19  # specific language governing permissions and limitations
    20  # under the License.
    21  import os
    22
    23  import argcomplete
    24
    25  from airflow.configuration import conf
    26  from airflow.bin.cli import CLIFactory
    27
    28  if __name__ == '__main__':
    29
    30      if conf.get("core", "security") == 'kerberos':
    31          os.environ['KRB5CCNAME'] = conf.get('kerberos', 'ccache')
    32          os.environ['KRB5_KTNAME'] = conf.get('kerberos', 'keytab')
    33
    34      parser = CLIFactory.get_parser()
    35      argcomplete.autocomplete(parser)
    36      args = parser.parse_args()
    37      args.func(args)

[centos7copy]$ echo $PYTHONPATH

 

[centos7copy]$ export PYTHONPATH="/usr/local/lib/python3.6/site-packages:$PYTHONPATH"
Try again

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow initdb
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/airflow", line 25, in <module>
    from airflow.configuration import conf
ImportError: No module named airflow.configuration

Just noticed: the shebang of the airflow script is "#!/usr/bin/env python".

[centos7copy]$ which python
/usr/bin/python
[centos7copy]$ ll /usr/bin/python
lrwxrwxrwx. 1 root root 7 Sep  1 02:39 /usr/bin/python -> python2
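The same check can be sketched in Python: read the first line of a script and work out which interpreter it would start. The function name `shebang_interpreter` is made up for illustration, and the `env` handling is simplified (options passed to `env` are not handled).

```python
import os
import shutil


def shebang_interpreter(script_path):
    """Return the interpreter a script's shebang points to, or None."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    # "#!/usr/bin/env NAME" means: use the first NAME found on PATH.
    if os.path.basename(parts[0]) == "env" and len(parts) > 1:
        return shutil.which(parts[1])
    return parts[0]
```

Passing the result through `os.path.realpath()` follows symlinks, so on this machine a script starting with "#!/usr/bin/env python" would presumably resolve to the python2 binary, which has none of the packages installed by pip3.6.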

[centos7copy]$ rm /usr/bin/python
rm: remove symbolic link ‘/usr/bin/python’? y
[centos7copy]$ ln -s `which python3` /usr/bin/python

[centos7copy]$ ll /usr/bin/python
lrwxrwxrwx. 1 root root 16 Nov  4 07:22 /usr/bin/python -> /usr/bin/python3

※Doing this ties every Python script on the system that starts with "#!/usr/bin/env python" or "#!/usr/bin/python" to python3, which can cause side effects.

For example, the yum command (/usr/bin/yum) is also a Python script with such a shebang; on CentOS 7 it worked with python2 but not with python3.
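A less invasive alternative, which I did not try on this machine, is to leave /usr/bin/python alone and name the interpreter explicitly when starting the script. When the interpreter is given on the command line, the shebang line is just a comment. The sketch below shows the idea; `run_with_interpreter` is a hypothetical helper, not an Airflow function.

```python
import subprocess
import sys


def run_with_interpreter(script_path, args=(), interpreter=sys.executable):
    """Run a script under an explicitly chosen interpreter.

    Because the interpreter is named on the command line, the script's
    shebang line is never consulted.
    """
    cmd = [interpreter, script_path] + list(args)
    return subprocess.run(
        cmd, stdout=subprocess.PIPE, universal_newlines=True
    )
```

From the shell this corresponds to putting the python3 path in front of the script path. Whether every airflow subcommand behaves well when started this way is an assumption on my part.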

 

Once more

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow initdb
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/airflow", line 26, in <module>
    from airflow.bin.cli import CLIFactory
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/cli.py", line 94, in <module>
    api_module = import_module(conf.get('cli', 'api_client'))  # type: Any
  File "/usr/lib64/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/root/.local/lib/python3.6/site-packages/airflow/api/client/local_client.py", line 24, in <module>
    from airflow.api.common.experimental import delete_dag
  File "/root/.local/lib/python3.6/site-packages/airflow/api/common/experimental/delete_dag.py", line 26, in <module>
    from airflow.models.serialized_dag import SerializedDagModel
  File "/root/.local/lib/python3.6/site-packages/airflow/models/serialized_dag.py", line 35, in <module>
    from airflow.serialization.serialized_objects import SerializedDAG
  File "/root/.local/lib/python3.6/site-packages/airflow/serialization/serialized_objects.py", line 28, in <module>
    import cattr
  File "/usr/local/lib/python3.6/site-packages/cattr/__init__.py", line 1, in <module>
    from .converters import Converter, GenConverter, UnstructureStrategy
  File "/usr/local/lib/python3.6/site-packages/cattr/converters.py", line 16, in <module>
    from attr import fields, resolve_types
ImportError: cannot import name 'resolve_types'

→What is this?

https://github.com/apache/airflow/issues/11965

 

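Looks like a bug

----Excerpt from the above begins----

According to the investigation, this was caused by a Python dependency, specifically cattrs == 1.1.0, which was released yesterday (2020-10-29). Manually downgrading cattrs to 1.0.0 fixes the problem and lets the Airflow database be initialized.

----End of excerpt----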
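The traceback shows cattr trying to import `resolve_types` from `attr`, while the pip log earlier shows apache-airflow requiring attrs~=19.3. As far as I know, `resolve_types` only exists in newer attrs releases, so the mismatch can be checked directly. This is a sketch with a made-up helper name, and the statement about attrs versions is my assumption, not something from the issue.

```python
import importlib


def attrs_has_resolve_types():
    """Check whether the installed attrs package offers resolve_types.

    Returns True or False, or None when attrs is not installed at all.
    """
    try:
        attr = importlib.import_module("attr")
    except ImportError:
        return None
    return hasattr(attr, "resolve_types")


if __name__ == "__main__":
    print("attr.resolve_types available:", attrs_has_resolve_types())
```

If this prints False, any cattrs release that imports `resolve_types` will fail in the same way, and pinning cattrs to 1.0.0 as described in the issue avoids that import.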

 

[centos7copy]$ pip3.6 freeze |grep cattr
cattrs==1.1.0

[centos7copy]$ pip3.6 install cattrs==1.0.0
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3.6 install --user` instead.
Collecting cattrs==1.0.0
  Downloading https://files.pythonhosted.org/packages/17/5b/6afbdaeb066ecf8ca28d85851048103ac80bb169491a54a14bd39823c422/cattrs-1.0.0-py2.py3-none-any.whl
Requirement already satisfied: attrs>=17.3 in /usr/local/lib/python3.6/site-packages (from cattrs==1.0.0)
Installing collected packages: cattrs
  Found existing installation: cattrs 1.1.0
    Uninstalling cattrs-1.1.0:
      Successfully uninstalled cattrs-1.1.0
Successfully installed cattrs-1.0.0
[centos7copy]$ pip3.6 freeze |grep cattr
cattrs==1.0.0

Once more

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow initdb
DB: sqlite:////root/airflow/airflow.db
[2020-11-04 07:51:37,277] {db.py:378} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> e3a246e0dc1, current schema
INFO  [alembic.runtime.migration] Running upgrade e3a246e0dc1 -> 1507a7289a2f, create is_encrypted
/usr/local/lib/python3.6/site-packages/alembic/ddl/sqlite.py:44: UserWarning: Skipping unsupported ALTER for creation of implicit constraintPlease refer to the batch mode feature which allows for SQLite migrations using a copy-and-move strategy.
  "Skipping unsupported ALTER for "
INFO  [alembic.runtime.migration] Running upgrade 1507a7289a2f -> 13eb55f81627, maintain history for compatibility with earlier migrations
INFO  [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 338e90f54d61, More logging into task_instance
INFO  [alembic.runtime.migration] Running upgrade 338e90f54d61 -> 52d714495f0, job_id indices
INFO  [alembic.runtime.migration] Running upgrade 52d714495f0 -> 502898887f84, Adding extra to Log
INFO  [alembic.runtime.migration] Running upgrade 502898887f84 -> 1b38cef5b76e, add dagrun
INFO  [alembic.runtime.migration] Running upgrade 1b38cef5b76e -> 2e541a1dcfed, task_duration
INFO  [alembic.runtime.migration] Running upgrade 2e541a1dcfed -> 40e67319e3a9, dagrun_config
INFO  [alembic.runtime.migration] Running upgrade 40e67319e3a9 -> 561833c1c74b, add password column to user
INFO  [alembic.runtime.migration] Running upgrade 561833c1c74b -> 4446e08588, dagrun start end
INFO  [alembic.runtime.migration] Running upgrade 4446e08588 -> bbc73705a13e, Add notification_sent column to sla_miss
INFO  [alembic.runtime.migration] Running upgrade bbc73705a13e -> bba5a7cfc896, Add a column to track the encryption state of the 'Extra' field in connection
INFO  [alembic.runtime.migration] Running upgrade bba5a7cfc896 -> 1968acfc09e3, add is_encrypted column to variable table
INFO  [alembic.runtime.migration] Running upgrade 1968acfc09e3 -> 2e82aab8ef20, rename user table
INFO  [alembic.runtime.migration] Running upgrade 2e82aab8ef20 -> 211e584da130, add TI state index
INFO  [alembic.runtime.migration] Running upgrade 211e584da130 -> 64de9cddf6c9, add task fails journal table
INFO  [alembic.runtime.migration] Running upgrade 64de9cddf6c9 -> f2ca10b85618, add dag_stats table
INFO  [alembic.runtime.migration] Running upgrade f2ca10b85618 -> 4addfa1236f1, Add fractional seconds to mysql tables
INFO  [alembic.runtime.migration] Running upgrade 4addfa1236f1 -> 8504051e801b, xcom dag task indices
INFO  [alembic.runtime.migration] Running upgrade 8504051e801b -> 5e7d17757c7a, add pid field to TaskInstance
INFO  [alembic.runtime.migration] Running upgrade 5e7d17757c7a -> 127d2bf2dfa7, Add dag_id/state index on dag_run table
INFO  [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> cc1e65623dc7, add max tries column to task instance
/root/.local/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
  category=PendingDeprecationWarning)
INFO  [alembic.runtime.migration] Running upgrade cc1e65623dc7 -> bdaa763e6c56, Make xcom value column a large binary
INFO  [alembic.runtime.migration] Running upgrade bdaa763e6c56 -> 947454bf1dff, add ti job_id index
INFO  [alembic.runtime.migration] Running upgrade 947454bf1dff -> d2ae31099d61, Increase text size for MySQL (not relevant for other DBs' text types)
INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 0e2a74e0fc9f, Add time zone awareness
INFO  [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 33ae817a1ff4, kubernetes_resource_checkpointing
INFO  [alembic.runtime.migration] Running upgrade 33ae817a1ff4 -> 27c6a30d7c24, kubernetes_resource_checkpointing
INFO  [alembic.runtime.migration] Running upgrade 27c6a30d7c24 -> 86770d1215c0, add kubernetes scheduler uniqueness
INFO  [alembic.runtime.migration] Running upgrade 86770d1215c0, 0e2a74e0fc9f -> 05f30312d566, merge heads
INFO  [alembic.runtime.migration] Running upgrade 05f30312d566 -> f23433877c24, fix mysql not null constraint
INFO  [alembic.runtime.migration] Running upgrade f23433877c24 -> 856955da8476, fix sqlite foreign key
INFO  [alembic.runtime.migration] Running upgrade 856955da8476 -> 9635ae0956e7, index-faskfail
INFO  [alembic.runtime.migration] Running upgrade 9635ae0956e7 -> dd25f486b8ea, add idx_log_dag
INFO  [alembic.runtime.migration] Running upgrade dd25f486b8ea -> bf00311e1990, add index to taskinstance
INFO  [alembic.runtime.migration] Running upgrade 9635ae0956e7 -> 0a2a5b66e19d, add task_reschedule table
INFO  [alembic.runtime.migration] Running upgrade 0a2a5b66e19d, bf00311e1990 -> 03bc53e68815, merge_heads_2
INFO  [alembic.runtime.migration] Running upgrade 03bc53e68815 -> 41f5f12752f8, add superuser field
INFO  [alembic.runtime.migration] Running upgrade 41f5f12752f8 -> c8ffec048a3b, add fields to dag
INFO  [alembic.runtime.migration] Running upgrade c8ffec048a3b -> dd4ecb8fbee3, Add schedule interval to dag
INFO  [alembic.runtime.migration] Running upgrade dd4ecb8fbee3 -> 939bb1e647c8, task reschedule fk on cascade delete
INFO  [alembic.runtime.migration] Running upgrade 939bb1e647c8 -> 6e96a59344a4, Make TaskInstance.pool not nullable
INFO  [alembic.runtime.migration] Running upgrade 6e96a59344a4 -> d38e04c12aa2, add serialized_dag table
Revision ID: d38e04c12aa2
Revises: 6e96a59344a4
Create Date: 2019-08-01 14:39:35.616417
INFO  [alembic.runtime.migration] Running upgrade d38e04c12aa2 -> b3b105409875, add root_dag_id to DAG
INFO  [alembic.runtime.migration] Running upgrade 6e96a59344a4 -> 74effc47d867, change datetime to datetime2(6) on MSSQL tables
INFO  [alembic.runtime.migration] Running upgrade 939bb1e647c8 -> 004c1210f153, increase queue name size limit
INFO  [alembic.runtime.migration] Running upgrade c8ffec048a3b -> a56c9515abdc, Remove dag_stat table
INFO  [alembic.runtime.migration] Running upgrade a56c9515abdc, 004c1210f153, 74effc47d867, b3b105409875 -> 08364691d074, Merge the four heads back together
INFO  [alembic.runtime.migration] Running upgrade 08364691d074 -> fe461863935f, increase_length_for_connection_password
INFO  [alembic.runtime.migration] Running upgrade fe461863935f -> 7939bcff74ba, Add DagTags table
INFO  [alembic.runtime.migration] Running upgrade 7939bcff74ba -> a4c2fd67d16b, add pool_slots field to task_instance
INFO  [alembic.runtime.migration] Running upgrade a4c2fd67d16b -> 852ae6c715af, Add RenderedTaskInstanceFields table
INFO  [alembic.runtime.migration] Running upgrade 852ae6c715af -> 952da73b5eff, add dag_code table
INFO  [alembic.runtime.migration] Running upgrade 952da73b5eff -> a66efa278eea, Add Precision to execution_date in RenderedTaskInstanceFields table
INFO  [alembic.runtime.migration] Running upgrade a66efa278eea -> da3f683c3a5a, Add dag_hash Column to serialized_dag table
Done.

→ Looks like it worked. It's only SQLite, though.
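
As a rough way to double-check that the migration really reached the end, the sketch below reads the `alembic_version` table that Alembic keeps in the database. The function name `current_revision` is made up for illustration, and the database path in the usage note is an assumption based on the default SQLite location under $AIRFLOW_HOME.

```python
import sqlite3


def current_revision(db_path):
    """Return the Alembic revision recorded in the SQLite file, or None.

    Alembic stores the current schema revision in a one-row table
    named alembic_version (column version_num).
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    finally:
        conn.close()
    return row[0] if row else None


if __name__ == "__main__":
    import sys

    # Pass the metadata DB path as an argument (the path is an assumption).
    if len(sys.argv) > 1:
        print(current_revision(sys.argv[1]))
```

If the database lives at /root/airflow/airflow.db (not confirmed here), running this with that path should presumably report the last revision shown in the log above, da3f683c3a5a.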

 

・Try starting the Airflow web server

[centos7copy]$ netstat -nap | grep 8080;echo $?
1

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow webserver -p 8080
---(略)---

=================================================================
Traceback (most recent call last):
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/airflow", line 37, in <module>
    args.func(args)
  File "/root/.local/lib/python3.6/site-packages/airflow/utils/cli.py", line 76, in wrapper
    return f(*args, **kwargs)
  File "/root/.local/lib/python3.6/site-packages/airflow/bin/cli.py", line 1177, in webserver
    gunicorn_master_proc = subprocess.Popen(run_args, close_fds=True)
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'gunicorn': 'gunicorn'

→ What is this?

https://stackoverflow.com/questions/39970104/airflow-startup-failed-due-to-gunicorn

 

Apparently this is resolved by making sure gunicorn is on the PATH.

[centos7copy]$ ll /usr/local/bin/gunicorn
-rwxr-xr-x. 1 root root 221 Nov  4 05:41 /usr/local/bin/gunicorn

Add /usr/local/bin to PATH once more

[centos7copy]$ export PATH="/usr/local/bin:/usr/local/sbin:$PATH"
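
The FileNotFoundError came from `subprocess.Popen` failing to find `gunicorn` on the PATH. A minimal sketch of the same check-and-fix in Python, assuming you launch the webserver from a wrapper script, might look like this. `ensure_on_path` is a hypothetical helper, not part of Airflow.

```python
import os
import shutil


def ensure_on_path(executable, extra_dir, env=None):
    """Return a copy of env in which `executable` can be resolved.

    If the executable is not found on the current PATH, extra_dir is
    prepended. The input mapping is never modified.
    """
    env = dict(os.environ if env is None else env)
    path = env.get("PATH", "")
    if shutil.which(executable, path=path) is None:
        # Prepend so that this directory wins over later entries.
        env["PATH"] = extra_dir + os.pathsep + path if path else extra_dir
    return env


if __name__ == "__main__":
    # /usr/local/bin is where gunicorn was found in this article.
    env = ensure_on_path("gunicorn", "/usr/local/bin")
    print(shutil.which("gunicorn", path=env["PATH"]))
```

The returned mapping could then be passed as `env=` to `subprocess.Popen` when starting the webserver.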

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow webserver -p 8080 &
[1] 104013
---(略)---
Logfiles: - -
---(略)---[2020-11-04 08:12:25,895] {__init__.py:50} INFO - Using executor SequentialExecutor
[2020-11-04 08:12:25,896] {dagbag.py:417} INFO - Filling up the DagBag from /root/airflow/dags
/root/.local/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
  category=PendingDeprecationWarning)
/root/.local/lib/python3.6/site-packages/airflow/models/dag.py:1342: PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
  category=PendingDeprecationWarning)

→ It appears to have started.

[centos7copy]$ netstat -nap | grep 8080
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      104020/gunicorn: ma
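
The `netstat -nap | grep 8080` check can also be approximated from Python by simply trying to connect to the port. This is only a sketch: `is_port_listening` is a name invented here, and unlike netstat it does not tell you which process owns the socket.

```python
import socket


def is_port_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8080 is the port passed to "airflow webserver -p 8080" above.
    print(is_port_listening("127.0.0.1", 8080))
```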

[centos7copy]$ ps -ef|grep -i airflo[w]
root     104013  91003  3 08:12 pts/1    00:00:06 python /root/.local/lib/python3.6/site-packages/airflow/bin/airflow webserver -p 8080
root     104020 104013  0 08:12 pts/1    00:00:01 gunicorn: master [airflow-webserver]
root     104130 104020  0 08:13 pts/1    00:00:00 [ready] gunicorn: worker [airflow-webserver]
root     104168 104020  1 08:14 pts/1    00:00:00 [ready] gunicorn: worker [airflow-webserver]
root     104197 104020  1 08:14 pts/1    00:00:00 [ready] gunicorn: worker [airflow-webserver]
root     104235 104020  5 08:15 pts/1    00:00:00 [ready] gunicorn: worker [airflow-webserver]
root     104254  91003  0 08:15 pts/1    00:00:00 grep --color=auto -i airflow

[centos7copy]$ pstree -p 104013

[centos7copy]$ lsof -p 104020 | egrep "[0-9]u"
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
gunicorn: 104020 root    0u   CHR   136,1       0t0       4 /dev/pts/1
gunicorn: 104020 root    1u   CHR   136,1       0t0       4 /dev/pts/1
gunicorn: 104020 root    2u   CHR   136,1       0t0       4 /dev/pts/1
gunicorn: 104020 root    5u  IPv4 1136903       0t0     TCP *:webcache (LISTEN)

---(略)---

→ Standard output and standard error are attached to the terminal.
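
For reference, the same thing that `lsof -p <pid>` shows for file descriptors 0 to 2 can be read directly from /proc. This is a Linux-only sketch; `std_stream_targets` is a hypothetical helper name, and reading another user's process may require root.

```python
import os


def std_stream_targets(pid):
    """Return where stdin/stdout/stderr of a process point to.

    Reads the symlinks /proc/<pid>/fd/0..2 (Linux only). A stream that
    cannot be read (closed fd, no permission, no such pid) maps to None.
    """
    targets = {}
    for fd, name in ((0, "stdin"), (1, "stdout"), (2, "stderr")):
        try:
            targets[name] = os.readlink(f"/proc/{pid}/fd/{fd}")
        except OSError:
            targets[name] = None
    return targets


if __name__ == "__main__":
    # Inspect this very process; pass another pid to inspect a daemon.
    print(std_stream_targets(os.getpid()))
```

A value like /dev/pts/1 means the stream is still tied to the terminal the process was started from.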

Start the scheduler

[centos7copy]$ /root/.local/lib/python3.6/site-packages/airflow/bin/airflow scheduler &
[1] 104921
---(略)---

→ It appears to have started.


[centos7copy]$ pstree -p 104921
python /root/.l(104921)---airflow schedul(104939)---airflow schedul(105047)

[centos7copy]$ lsof -p 104921 | egrep "[0-9]u"
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
python  104921 root    0u   CHR              136,6       0t0       9 /dev/pts/6
python  104921 root    1u   CHR              136,6       0t0       9 /dev/pts/6
python  104921 root    2u   CHR              136,6       0t0       9 /dev/pts/6
python  104921 root    3u  unix 0xffff9aefba9061c0       0t0 1140248 socket

[centos7copy]$ lsof -p 104939 | egrep "[0-9]u"
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
airflow 104939 root    0u   CHR              136,6       0t0       9 /dev/pts/6
airflow 104939 root    1u   CHR              136,6       0t0       9 /dev/pts/6
airflow 104939 root    2u   CHR              136,6       0t0       9 /dev/pts/6

---(略)---
[centos7copy]$ lsof -p 105047 | egrep "[0-9]u"
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.

[centos7copy]$ pstree -p 104921
python /root/.l(104921)---airflow schedul(104939)---airflow schedul(105793)
[centos7copy]$ pstree -p 104921
python /root/.l(104921)---airflow schedul(104939)---airflow schedul(105804)

 

Could it be that, for the Airflow web server and scheduler, you are expected to capture the logs by redirecting standard output and standard error yourself?
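
If that guess is right, one way to do it is to start the process with both streams pointed at a file. Below is a minimal sketch using `subprocess.Popen`; `start_with_log` is a hypothetical helper and the log path in the usage note is an arbitrary choice, not something Airflow prescribes.

```python
import subprocess
import sys


def start_with_log(cmd, log_path):
    """Start cmd with stdout and stderr appended to log_path.

    stdin is detached from the terminal. The parent closes its own
    handle right away; the child keeps its inherited copy of the fd.
    """
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,  # send stderr to the same file
        )
    finally:
        log.close()


if __name__ == "__main__":
    # Stand-in command; replace it with the airflow command line as needed.
    proc = start_with_log([sys.executable, "-c", "print('hello')"], "demo.log")
    proc.wait()
```

For the scheduler, the command list would be something like `["airflow", "scheduler"]` with a log path of your choosing. It is the Python counterpart of `command >> file 2>&1 &` in the shell.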

 

 

■Official sites

Jupyter Project Documentation

The Jupyter Notebook (official documentation)

 

■Overview

What is the Jupyter Notebook interface?

 

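The Jupyter Notebook interface is a web-based application for creating documents that combine live code with narrative text, equations, and visualizations.

What is a Jupyter kernel?

In Jupyter, a kernel is the process (an interpreter or runtime) that interactively executes the code you enter and returns the result. Python code is executed by the IPython kernel; to run code in other languages, you need to install a kernel for that language separately. For the list of available kernels, see → Jupyter kernels.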
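
To make "a kernel is a process" a bit more concrete: each installed kernel is described by a kernel spec (kernel.json) whose `argv` is the command line used to start it, with `{connection_file}` replaced at launch time. The sketch below mimics that substitution. The spec contents are illustrative rather than copied from this machine, and `kernel_command` is a made-up helper, not a Jupyter API.

```python
import json

# Illustrative kernel spec in the documented kernel.json layout.
IPYKERNEL_SPEC = json.loads("""
{
  "argv": ["python3", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python 3",
  "language": "python"
}
""")


def kernel_command(spec, connection_file):
    """Build the kernel's command line from a kernel spec.

    Every occurrence of the {connection_file} placeholder in argv is
    replaced with the given path.
    """
    return [arg.replace("{connection_file}", connection_file) for arg in spec["argv"]]


if __name__ == "__main__":
    print(kernel_command(IPYKERNEL_SPEC, "/tmp/kernel-example.json"))
```

The resulting `python3 -m ipykernel_launcher -f <connection file>` form is what a running Python kernel looks like in `ps` output.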

 

 

■Installation

Click the Installation link in ② above.

I have a feeling I already installed Jupyter Notebook back in "JupyterHub Part 1: Installation".

 

■Starting the Notebook

Click the Starting the Notebook link in ② above.

[centos7copy]$ which jupyter
/usr/local/bin/jupyter

→ It's there.

[centos7copy]$ jupyter -h | grep notebook
nbextension notebook run serverextension troubleshoot trust

Try starting it.

[centos7copy]$ su dagyah -c "jupyter notebook" &

[centos7copy]$ ps -ef | grep jupyte[r]
root      50077  23782  0 01:39 pts/2    00:00:01 /usr/bin/python3 /usr/local/bin/jupyterhub -f /etc/jupyterhub_config.py
dagyah    52644  50077  0 02:08 ?        00:00:01 /usr/bin/python3 /usr/local/bin/jupyterhub-singleuser --port=46267
root      53650  52979  0 02:19 pts/5    00:00:00 su dagyah -c jupyter notebook
dagyah    53652  53650  0 02:19 ?        00:00:00 /usr/bin/python3 /usr/local/bin/jupyter-notebook

[centos7copy]$ pstree -p 53650
su(53650)---jupyter-noteboo(53652)

[centos7copy]$ lsof -p 53652
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
COMMAND     PID   USER   FD      TYPE             DEVICE  SIZE/OFF    NODE NAME
jupyter-n 53652 dagyah  cwd       DIR                8,2      4096  786433 /root
jupyter-n 53652 dagyah  rtd       DIR                8,2      4096       2 /
jupyter-n 53652 dagyah  txt       REG                8,2     11336  950063 /usr/bin/python3.6
jupyter-n 53652 dagyah  mem       REG                8,2     22480  308116 /usr/lib64/python3.6/lib-dynload/_lsprof.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     61560  925844 /usr/lib64/libnss_files-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2     19496  308138 /usr/lib64/python3.6/lib-dynload/fcntl.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    753280  920020 /usr/lib64/libsqlite3.so.0.8.6
jupyter-n 53652 dagyah  mem       REG                8,2     85808  308130 /usr/lib64/python3.6/lib-dynload/_sqlite3.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     20064  934647 /usr/lib64/libuuid.so.1.3.0
jupyter-n 53652 dagyah  mem       REG                8,2     26008  308151 /usr/lib64/python3.6/lib-dynload/termios.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     26680  308141 /usr/lib64/python3.6/lib-dynload/mmap.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     35640  308104 /usr/lib64/python3.6/lib-dynload/_csv.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     54056   24903 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/_proxy_steerable.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     58336   24904 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/_device.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     36536   24893 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/_version.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     75464   24899 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/_poll.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     53928   24890 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/utils.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    162832   24907 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/socket.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     71272   24900 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/context.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    102000   24901 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/message.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     40896   24889 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/error.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     88776  925306 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
jupyter-n 53652 dagyah  mem       REG                8,2    991616  919592 /usr/lib64/libstdc++.so.6.0.19
jupyter-n 53652 dagyah  mem       REG                8,2   1196672   24992 /usr/local/lib64/python3.6/site-packages/pyzmq.libs/libsodium-bcf9f097.so.23.3.0
jupyter-n 53652 dagyah  mem       REG                8,2     43712  925847 /usr/lib64/librt-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2    889528   24991 /usr/local/lib64/python3.6/site-packages/pyzmq.libs/libzmq-1358af2c.so.5.2.2
jupyter-n 53652 dagyah  mem       REG                8,2     81072   24895 /usr/local/lib64/python3.6/site-packages/zmq/backend/cython/constants.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     32328  919951 /usr/lib64/libffi.so.6.0.1
jupyter-n 53652 dagyah  mem       REG                8,2    134000  308105 /usr/lib64/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    899576  308152 /usr/lib64/python3.6/lib-dynload/unicodedata.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    402000  308110 /usr/lib64/python3.6/lib-dynload/_decimal.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     38875   23836 /usr/local/lib64/python3.6/site-packages/markupsafe/_speedups.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     16400  308147 /usr/lib64/python3.6/lib-dynload/resource.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     59976  308093 /usr/lib64/python3.6/lib-dynload/_asyncio.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     17008  308120 /usr/lib64/python3.6/lib-dynload/_multiprocessing.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    174576  919648 /usr/lib64/libtinfo.so.5.9
jupyter-n 53652 dagyah  mem       REG                8,2    234720  919640 /usr/lib64/libncursesw.so.5.9
jupyter-n 53652 dagyah  mem       REG                8,2     85872  308106 /usr/lib64/python3.6/lib-dynload/_curses.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     29384 1048606 /usr/local/lib64/python3.6/site-packages/tornado/speedups.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     58024  308134 /usr/lib64/python3.6/lib-dynload/array.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    132240  308122 /usr/lib64/python3.6/lib-dynload/_pickle.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2      7448  308121 /usr/lib64/python3.6/lib-dynload/_opcode.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     51592  308115 /usr/lib64/python3.6/lib-dynload/_json.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     16752  308123 /usr/lib64/python3.6/lib-dynload/_posixsubprocess.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    118576  308131 /usr/lib64/python3.6/lib-dynload/_ssl.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    110968  308108 /usr/lib64/python3.6/lib-dynload/_datetime.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     33536  308148 /usr/lib64/python3.6/lib-dynload/select.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    106056  308129 /usr/lib64/python3.6/lib-dynload/_socket.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     20576  308124 /usr/lib64/python3.6/lib-dynload/_random.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     51320  308140 /usr/lib64/python3.6/lib-dynload/math.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     98176  308127 /usr/lib64/python3.6/lib-dynload/_sha3.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     47480  308095 /usr/lib64/python3.6/lib-dynload/_blake2.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    402384  919745 /usr/lib64/libpcre.so.1.2.0
jupyter-n 53652 dagyah  mem       REG                8,2    155744  922449 /usr/lib64/libselinux.so.1
jupyter-n 53652 dagyah  mem       REG                8,2    109976  925846 /usr/lib64/libresolv-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2     15688  920528 /usr/lib64/libkeyutils.so.1.5
jupyter-n 53652 dagyah  mem       REG                8,2     67104  931658 /usr/lib64/libkrb5support.so.0.1
jupyter-n 53652 dagyah  mem       REG                8,2    210784  921017 /usr/lib64/libk5crypto.so.3.1
jupyter-n 53652 dagyah  mem       REG                8,2     15856  921269 /usr/lib64/libcom_err.so.2.1
jupyter-n 53652 dagyah  mem       REG                8,2    967760  921023 /usr/lib64/libkrb5.so.3.3
jupyter-n 53652 dagyah  mem       REG                8,2    320720  921013 /usr/lib64/libgssapi_krb5.so.2.2
jupyter-n 53652 dagyah  mem       REG                8,2   2521144  921037 /usr/lib64/libcrypto.so.1.0.2k
jupyter-n 53652 dagyah  mem       REG                8,2    470376  921039 /usr/lib64/libssl.so.1.0.2k
jupyter-n 53652 dagyah  mem       REG                8,2     27080  308113 /usr/lib64/python3.6/lib-dynload/_hashlib.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     13608  308094 /usr/lib64/python3.6/lib-dynload/_bisect.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     26728  308136 /usr/lib64/python3.6/lib-dynload/binascii.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     52664  308132 /usr/lib64/python3.6/lib-dynload/_struct.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     13224  308139 /usr/lib64/python3.6/lib-dynload/grp.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2    157424  919770 /usr/lib64/liblzma.so.5.2.2
jupyter-n 53652 dagyah  mem       REG                8,2     38944  308117 /usr/lib64/python3.6/lib-dynload/_lzma.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     68192  919877 /usr/lib64/libbz2.so.1.0.6
jupyter-n 53652 dagyah  mem       REG                8,2     23056  308096 /usr/lib64/python3.6/lib-dynload/_bz2.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     90248  919757 /usr/lib64/libz.so.1.2.7
jupyter-n 53652 dagyah  mem       REG                8,2     34528  308154 /usr/lib64/python3.6/lib-dynload/zlib.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2     22904  308114 /usr/lib64/python3.6/lib-dynload/_heapq.cpython-36m-x86_64-linux-gnu.so
jupyter-n 53652 dagyah  mem       REG                8,2 106172832  138666 /usr/lib/locale/locale-archive
jupyter-n 53652 dagyah  mem       REG                8,2   2156240  919299 /usr/lib64/libc-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2   1136944  925841 /usr/lib64/libm-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2     14424  919333 /usr/lib64/libutil-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2     19248  925840 /usr/lib64/libdl-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2    142144  919325 /usr/lib64/libpthread-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2   3131840  950054 /usr/lib64/libpython3.6m.so.1.0
jupyter-n 53652 dagyah  mem       REG                8,2    163312  919292 /usr/lib64/ld-2.17.so
jupyter-n 53652 dagyah  mem       REG                8,2     26970  919615 /usr/lib64/gconv/gconv-modules.cache
jupyter-n 53652 dagyah    0u      CHR              136,5       0t0       8 /dev/pts/5
jupyter-n 53652 dagyah    1u      CHR              136,5       0t0       8 /dev/pts/5
jupyter-n 53652 dagyah    2u      CHR              136,5       0t0       8 /dev/pts/5

jupyter-n 53652 dagyah    3w      CHR                1,3       0t0    1028 /dev/null
jupyter-n 53652 dagyah    4u     IPv4             568845       0t0     TCP localhost:ddi-tcp-1 (LISTEN)
jupyter-n 53652 dagyah    5u     IPv6             568846       0t0     TCP localhost:ddi-tcp-1 (LISTEN)
jupyter-n 53652 dagyah    6u  a_inode               0,10         0    8534 [eventpoll]
jupyter-n 53652 dagyah    7u     unix 0xffff9aef4276ee80       0t0  568847 socket
jupyter-n 53652 dagyah    8u     unix 0xffff9aef4276f2c0       0t0  568848 socket

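→ Like jupyterhub, jupyter notebook has its standard output and standard error attached to the terminal, so it looks like you are expected to redirect them to some file at startup and treat that file as the log.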

 

・Log in to JupyterHub from a browser, run an infinite loop in Python, and inspect the processes.

 

[centos7copy]$ cat helloworld.py
print("HelloWorld")
・From the "New" pull-down at the upper right, click "Python 3"

 

 

 

・Look at the process state while the infinite-loop program is running.

 

[centos7copy]$ ps -ef | grep jupyte[r]
root      50077  23782  0 01:39 pts/2    00:00:01 /usr/bin/python3 /usr/local/bin/jupyterhub -f /etc/jupyterhub_config.py
dagyah    52644  50077  0 02:08 ?        00:00:02 /usr/bin/python3 /usr/local/bin/jupyterhub-singleuser --port=46267
root      53650  52979  0 02:19 pts/5    00:00:00 su dagyah -c jupyter notebook
dagyah    53652  53650  0 02:19 ?        00:00:00 /usr/bin/python3 /usr/local/bin/jupyter-notebook
dagyah    54723  52644  0 02:32 ?        00:00:00 /usr/bin/python3 -m ipykernel_launcher -f /home/dagyah/.local/share/jupyter/runtime/kernel-40db3ab1-1742-4316-bfac-70b0a74cd2f5.json

[centos7copy]$ ps -ef | grep python[3]
root      50077  23782  0 01:39 pts/2    00:00:01 /usr/bin/python3 /usr/local/bin/jupyterhub -f /etc/jupyterhub_config.py
dagyah    52644  50077  0 02:08 ?        00:00:03 /usr/bin/python3 /usr/local/bin/jupyterhub-singleuser --port=46267
dagyah    53652  53650  0 02:19 ?        00:00:00 /usr/bin/python3 /usr/local/bin/jupyter-notebook
dagyah    54723  52644  0 02:32 ?        00:00:01 /usr/bin/python3 -m ipykernel_launcher -f /home/dagyah/.local/share/jupyter/runtime/kernel-40db3ab1-1742-4316-bfac-70b0a74cd2f5.json

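→ The kernel was not forked from jupyter-notebook (pid=53652). The parent of the Python infinite-loop process (pid=54723) is jupyterhub-singleuser (pid=52644), which jupyterhub forked when I logged in to JupyterHub from the browser.

→ So how does jupyterhub start the notebook?

→ As shown above, it seems to start on its own as soon as you log in to JupyterHub.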
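
The parent/child relationship read off the `ps -ef` output can also be looked up per process from /proc. This is a Linux-only sketch, and `parent_pid` is a hypothetical helper name.

```python
import os


def parent_pid(pid):
    """Return the parent PID of a process by reading /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            # The line looks like "PPid:\t1234".
            if line.startswith("PPid:"):
                return int(line.split()[1])
    raise ValueError(f"no PPid entry found for pid {pid}")


if __name__ == "__main__":
    # Demonstrate on this process; pass the kernel's pid to check its parent.
    print(parent_pid(os.getpid()))
```

Calling it repeatedly on the returned value walks up the tree, which is handy when `pstree` is not installed.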

 

※ You can get as far as the login screen at http://127.0.0.1:8888.

 

■Log files

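Like jupyterhub, jupyter notebook has its standard output and standard error attached to the terminal, so it looks like you are expected to redirect them to some file at startup and treat that file as the log.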

 

・Start jupyterhub

[centos7copy]$  jupyterhub -f /etc/jupyterhub_config.py &

[centos7copy]$ ps -ef | grep jupyte[r]
root       9543   4962  6 00:55 pts/1    00:00:00 /usr/bin/python3 /usr/local/bin/jupyterhub -f /etc/jupyterhub_config.py

・Log in to the running jupyterhub from a browser

[centos7copy]$ ps -ef | grep jupyte[r]
root       9543   4962  2 00:55 pts/1    00:00:00 /usr/bin/python3 /usr/local/bin/jupyterhub -f /etc/jupyterhub_config.py
dagyah     9798   9543 12 00:55 ?        00:00:00 /usr/bin/python3 /usr/local/bin/jupyterhub-singleuser --port=32753

[centos7copy]$ lsof -p 9798 | egrep [012]u
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
jupyterhu 9798 dagyah    0u      CHR              136,1       0t0        4 /dev/pts/1
jupyterhu 9798 dagyah    1u      CHR              136,1       0t0        4 /dev/pts/1
jupyterhu 9798 dagyah    2u      CHR              136,1       0t0        4 /dev/pts/1