Commit b296248

Gouravchawla334 patch 1 (#18)
* Update airflow.md * Update airflow.md
1 parent 3f2f51c commit b296248

1 file changed

+62
-1
lines changed

content/airflow.md

Lines changed: 62 additions & 1 deletion
@@ -43,6 +43,9 @@ It is important to remember that airflow operators can be run more than once whe
## Explain Airflow Architecture and its components?
Airflow has four major components.

[Airflow architecture: a deep dive](https://medium.com/@bageshwar.kumar/airflow-architecture-a-deep-dive-into-data-pipeline-orchestration-217dd2dbc1c3)

+ Webserver
    + This is the Airflow UI, built on Flask. It provides an overview of the overall health of the DAGs and helps visualise the components and states of each DAG. The Web Server also lets you manage users, roles, and configuration for the Airflow setup.
+ Scheduler
@@ -135,6 +138,64 @@ The schedule interval specifies how often each workflow is scheduled to run. '*

[Table of Contents](#Apache-Airflow)

## Understanding Cron Expression in Airflow

In `schedule_interval='30 8 * * 1-5'`, the string `30 8 * * 1-5` is a **cron expression**, the same format used on Unix-like systems to define when a task runs. (Airflow 2.4 and later accept the same string through the `schedule` argument.) Here's a detailed breakdown:

## Cron Expression Structure

A cron expression is composed of 5 fields separated by spaces:

| Field            | Position | Allowed Values      | Description                       |
|------------------|----------|---------------------|-----------------------------------|
| **Minute**       | 1        | `0-59`              | The minute of the hour            |
| **Hour**         | 2        | `0-23`              | The hour of the day               |
| **Day of Month** | 3        | `1-31`              | The day of the month              |
| **Month**        | 4        | `1-12` or `JAN-DEC` | The month                         |
| **Day of Week**  | 5        | `0-6` or `SUN-SAT`  | The day of the week (0 = Sunday)  |

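To make the field order concrete, here is a minimal sketch in plain Python that splits a 5-field cron expression into named parts. The names `CRON_FIELDS` and `split_cron` are hypothetical helpers made up for this illustration; they are not part of Airflow.

```python
# Field names in the order they appear in a standard 5-field cron expression.
# (Hypothetical helper for illustration; not an Airflow API.)
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Map each field name to its raw text in the expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


fields = split_cron("30 8 * * 1-5")
for name, value in fields.items():
    print(f"{name}: {value}")
```

The sketch only labels the fields; it does not validate the allowed values listed in the table.
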
## Detailed Explanation of `30 8 * * 1-5`

1. **`30` (Minute)**:
   - The task runs at the **30th minute** of the hour.
   - Example: If the hour is `8`, the task executes at `08:30`.

2. **`8` (Hour)**:
   - The task runs at **hour 8** on the 24-hour clock (8 AM).
   - Example: Combined with the minute field, it executes at `08:30`.

3. **`*` (Day of Month)**:
   - The asterisk (`*`) means "every day of the month."
   - Example: It doesn't matter whether it's the 1st, 15th, or 30th.

4. **`*` (Month)**:
   - The asterisk (`*`) means "every month."
   - Example: It will run in January, February, and so on.

5. **`1-5` (Day of Week)**:
   - The range `1-5` means the task runs **Monday to Friday**, because `0` is Sunday.
   - Example: It skips weekends (Saturday and Sunday).

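The field-by-field reading above can be sketched as a small matcher in plain Python. This is an illustration only, not how Airflow evaluates schedules internally: `field_matches` and `cron_matches` are hypothetical names, only numeric values, ranges, comma lists, and `*` are handled, and the day-of-month and day-of-week fields are simply combined with AND (standard cron treats them as OR when both are restricted, which does not matter here because day of month is `*`).

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Check one numeric cron field: supports '*', 'a', 'a-b', and comma lists."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` falls on a minute selected by the expression."""
    minute, hour, day_of_month, month, day_of_week = expression.split()
    # cron counts Sunday as 0; isoweekday() counts Monday=1 ... Sunday=7.
    cron_weekday = moment.isoweekday() % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day_of_month, moment.day)
        and field_matches(month, moment.month)
        and field_matches(day_of_week, cron_weekday)
    )


# Monday, January 6, 2025 at 08:30 is selected; Saturday, January 11 is not.
print(cron_matches("30 8 * * 1-5", datetime(2025, 1, 6, 8, 30)))   # True
print(cron_matches("30 8 * * 1-5", datetime(2025, 1, 11, 8, 30)))  # False
```
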
## When Will This Schedule Trigger?

This cron expression means:
- **Time**: 8:30 AM, in the DAG's timezone (UTC by default).
- **Days**: Monday through Friday.
- **Frequency**: Once per day, on weekdays only.

## Examples of Trigger Dates
Assuming the current month is January 2025, the schedule triggers on, for example:
- Monday, January 6, 2025, at 08:30 AM.
- Tuesday, January 7, 2025, at 08:30 AM.
- Wednesday, January 8, 2025, at 08:30 AM.
- (And so on for all weekdays...)

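The dates above can be reproduced with a short sketch using only Python's `datetime` module. It hardcodes the "08:30, Monday to Friday" rule instead of parsing the cron string, and `weekday_triggers` is a hypothetical helper name, not an Airflow function.

```python
from datetime import date, datetime, time, timedelta


def weekday_triggers(start: date, count: int) -> list:
    """Return the first `count` 08:30 Monday-to-Friday datetimes on or after `start`."""
    triggers = []
    day = start
    while len(triggers) < count:
        if day.weekday() < 5:  # weekday(): Monday=0 ... Friday=4
            triggers.append(datetime.combine(day, time(8, 30)))
        day += timedelta(days=1)
    return triggers


# Six triggers starting from Monday, January 6, 2025; the weekend is skipped,
# so the sixth one lands on Monday, January 13.
for trigger in weekday_triggers(date(2025, 1, 6), 6):
    print(trigger.isoformat(sep=" ", timespec="minutes"))
```
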
## Real-World Use Case

You might use this schedule for tasks that should run once at the start of each workday, such as:
- Sending daily reports to a team.
- Updating a database with data from the previous day.
- Running data pipelines before peak usage begins.

## How do you make the module available to airflow if you're using Docker Compose?
With Docker Compose, we need a custom image that includes our additional dependencies in order to make the module available to Airflow. Refer to the following Airflow Documentation for reasons why we need it and how to do it.
@@ -170,4 +231,4 @@ Jinja is a templating engine that is quick, expressive, and extendable. The temp
We can use XComs in Jinja templates as given below:
+ SELECT * FROM {{ task_instance.xcom_pull(task_ids='foo', key='table_name') }}

[Table of Contents](#Apache-Airflow)
