tests.system.apache.hive.example_twitter_dag
This is an example DAG for managing Twitter data.
Module Contents
Functions
fetch_tweets(): Call the Twitter API and retrieve yesterday's tweets sent from and to the four Twitter users.
clean_tweets(): A placeholder for cleaning the eight files.
analyze_tweets(): A placeholder for analyzing the Twitter data.
A placeholder for extracting a summary from the Hive data and storing it in MySQL.
Attributes
- tests.system.apache.hive.example_twitter_dag.fetch_tweets()
This task should call the Twitter API and retrieve yesterday's tweets sent from and to each of the four Twitter users (Twitter_A,..,Twitter_D). It should generate eight CSV output files, named according to the convention direction(from or to)_twitterHandle_date.csv.
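The naming convention can be illustrated with a minimal sketch. The helper `expected_filenames`, the `YYYYMMDD` date format, and the expansion of "Twitter_A,..,Twitter_D" into four handles are all assumptions made for illustration; the module itself does not define them.

```python
from datetime import date, timedelta
from itertools import product

DIRECTIONS = ("from", "to")
# Assumed reading of "Twitter_A,..,Twitter_D" as four handles.
HANDLES = ("Twitter_A", "Twitter_B", "Twitter_C", "Twitter_D")


def expected_filenames(run_date: date) -> list[str]:
    """Hypothetical helper: list the eight CSV names for the day before run_date."""
    yesterday = run_date - timedelta(days=1)
    stamp = yesterday.strftime("%Y%m%d")  # the date format is an assumption
    return [
        f"{direction}_{handle}_{stamp}.csv"
        for direction, handle in product(DIRECTIONS, HANDLES)
    ]


names = expected_filenames(date(2024, 1, 2))
print(len(names))  # 8
print(names[0])    # from_Twitter_A_20240101.csv
```

Two directions times four handles gives the eight files mentioned in the description.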
- tests.system.apache.hive.example_twitter_dag.clean_tweets()
This is a placeholder for cleaning the eight files. In this step you can drop or cherry-pick columns and parts of the text.
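As a rough illustration of what such a cleaning step might look like, the sketch below keeps a chosen subset of columns and strips URLs from the tweet text. The function `clean_rows` and the column names `id`, `text`, `lang`, and `source` are hypothetical; the placeholder in the module does not prescribe any of them.

```python
import csv
import io
import re

URL_PATTERN = re.compile(r"https?://\S+")


def clean_rows(source, keep_columns, text_column="text"):
    """Hypothetical helper: keep selected columns and remove URLs from the text."""
    reader = csv.DictReader(source)
    for row in reader:
        # Cherry-pick only the requested columns.
        cleaned = {col: row[col] for col in keep_columns}
        if text_column in cleaned:
            # Drop URLs, then collapse any leftover runs of whitespace.
            text = URL_PATTERN.sub("", cleaned[text_column])
            cleaned[text_column] = " ".join(text.split())
        yield cleaned


# In-memory stand-in for one of the CSV files; the columns are assumed.
sample = io.StringIO(
    "id,text,lang,source\n"
    "1,Loving the new release https://example.com/x,en,web\n"
)
cleaned = list(clean_rows(sample, keep_columns=["id", "text"]))
print(cleaned)
```

Because `clean_rows` accepts any iterable of lines, the same function could be applied to an open file handle for each of the eight files.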
- tests.system.apache.hive.example_twitter_dag.analyze_tweets()
This is a placeholder for analyzing the Twitter data. The analysis could be as simple as sentiment analysis with an algorithm such as bag of words, or something more complicated. You can also look at web services that perform such tasks.
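A bag-of-words sentiment score can be sketched in a few lines. The word lists and the `sentiment_score` function below are illustrative assumptions, not part of the module; a real analysis would likely use a proper lexicon or a trained model.

```python
import re
from collections import Counter

# Tiny illustrative lexicons; these word lists are assumptions.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def sentiment_score(text: str) -> int:
    """Hypothetical scorer: positive word count minus negative word count."""
    # Bag of words: word order is discarded, only per-word counts are kept.
    words = Counter(re.findall(r"[a-z']+", text.lower()))
    positive = sum(count for word, count in words.items() if word in POSITIVE)
    negative = sum(count for word, count in words.items() if word in NEGATIVE)
    return positive - negative


print(sentiment_score("I love this, great great work"))  # 3
print(sentiment_score("Awful service, I hate it"))       # -2
```

A score above zero suggests a positive tweet and a score below zero a negative one; this simple count ignores negation and context.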