UCL-ELEC0136: Data Acquisition and Processing Systems

Submission deadline:

Please check the Moodle page of the course.

1 Objectives

The objective of this assignment is to simulate a real-life data science scenario that aligns with the process discussed in class. This process involves:

  1. Finding and acquiring a source of data.

  2. Storing the acquired data.

  3. Cleaning and pre-processing the data.

  4. Extracting meaningful visualizations.

  5. Building a model for inference.

You are encouraged to utilize any additional methods you deem suitable for solving the problem. The assignment comprises two main deliverables:

  1. A written report presented in the format of an academic paper.

  2. The accompanying codebase to support your report.

While exchanging ideas and discussing the assignment with your peers is allowed, it is essential to emphasize that your code, experiments, and report must be the result of your individual effort.

2 Overview

Assume you are a junior Data Scientist at Money, a UK investment company, and your project manager, Melanie, provides you with the following list of public companies:

• Apple Inc. (AAPL),

• Microsoft Corp. (MSFT),

• American Airlines Group Inc. (AAL),

• Zoom Video Communications Inc. (ZM).

You must select ONE of these companies and study its market trends to ultimately be able to advise on when and whether Money should (I) buy, (II) hold, or (III) sell this stock.

Melanie asked you to follow the company guidelines, which advise this process:

  1. Select a company and acquire stock data from the beginning of April 2019 up to the end of March 2023.

  2. Collect any other data on external events (e.g., seasonal trends, world news etc.) that might have an impact on the company's stocks.

  3. Choose the storing strategy that most efficiently supports the upcoming data analysis.

  4. Check for any missing/noisy/outlier data, and clean it, only if necessary.

  5. Process the data, extracting features that you believe are meaningful to forecast the trend of the stock.

  6. Provide useful visualisations of the data, exploiting patterns you might find.

  7. Train a model to predict the closing stock price.

Details for each task are provided in Section 2. Details of how each task is marked are included in Section 3.
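As an illustration of steps 4 and 5 of the guidelines, the following is a minimal sketch using only the Python standard library. It assumes the stock data is already held as a mapping from dates to closing prices; the helper names `missing_weekdays` and `rolling_mean` and the prices are hypothetical, made up for this example. Note that a missing weekday is not necessarily an error: market holidays also appear as gaps and need to be told apart from genuinely missing records.

```python
from datetime import date, timedelta


def missing_weekdays(observed, start, end):
    """Return the weekdays in [start, end] that have no observation."""
    seen = set(observed)
    missing = []
    day = start
    while day <= end:
        # weekday() is 0-4 for Monday-Friday; weekends are skipped.
        if day.weekday() < 5 and day not in seen:
            missing.append(day)
        day += timedelta(days=1)
    return missing


def rolling_mean(values, window):
    """Trailing moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


# Made-up closing prices for illustration only (not real market data).
closes = {
    date(2019, 4, 1): 10.0,
    date(2019, 4, 2): 11.0,
    date(2019, 4, 4): 12.0,
    date(2019, 4, 5): 13.0,
}

gaps = missing_weekdays(closes.keys(), date(2019, 4, 1), date(2019, 4, 5))
features = rolling_mean([closes[d] for d in sorted(closes)], window=2)
```

Here `gaps` flags the weekday with no record, and `features` is one possible smoothed feature; the window length and the choice of feature are design decisions to justify in the report.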

3 Task Details

[IMPORTANT NOTE]

Tasks 1.2, 2.2, 4.2 and 6 are more advanced, but based on the scoring criteria provided in Section 5, you can pass this assignment without solving these tasks. However, you would need to solve these to achieve a top-distinction range.

The percentage given in each task description is that task's weight within the 70% of the mark allocated to the report, as defined in Section 5.

Task 1: Data Acquisition

You will first have to acquire the necessary data to conduct your study.

Task 1.1 [5%]

One essential type of data that you will need is the stock prices for the company you have chosen, spanning from the 1st of April 2019 to the 31st of March 2023, as described in Section 1. Since these companies are public, the data is available online. Note that all data sources must be accessed exclusively through a web API, not by downloading files manually. The first task is to search for and collect stock prices, finding the best way to access and acquire them through a web API.
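A programmatic request for Task 1.1 might look like the sketch below. The endpoint `https://api.example.com/v1/daily-prices`, its query parameters, and the JSON layout (`{"prices": [{"date": ..., "close": ...}]}`) are all assumptions made for illustration, not a real provider's API; adapt them to the documentation of the service you select.

```python
import json
from datetime import date
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: replace with the provider you choose.
BASE_URL = "https://api.example.com/v1/daily-prices"


def build_url(symbol, start, end):
    """Build the request URL for one ticker and an inclusive date range."""
    query = urlencode({
        "symbol": symbol,
        "from": start.isoformat(),
        "to": end.isoformat(),
    })
    return f"{BASE_URL}?{query}"


def parse_prices(payload):
    """Turn the assumed JSON payload into (date, close) tuples sorted by date."""
    rows = json.loads(payload)["prices"]
    return sorted(
        (date.fromisoformat(row["date"]), float(row["close"])) for row in rows
    )


def fetch_prices(symbol, start, end):
    """Download and parse prices (not called here: it needs a real endpoint)."""
    with urlopen(build_url(symbol, start, end), timeout=30) as response:
        return parse_prices(response.read().decode("utf-8"))


url = build_url("AAPL", date(2019, 4, 1), date(2023, 3, 31))

# Made-up payload in the assumed layout, used to exercise the parser offline.
sample = (
    '{"prices": [{"date": "2019-04-02", "close": "2.5"},'
    ' {"date": "2019-04-01", "close": "1.5"}]}'
)
parsed = parse_prices(sample)
```

Keeping URL construction and parsing separate from the network call makes the acquisition code testable without hitting the API, which also helps when a provider enforces rate limits.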

Task 1.2 [7%]

Search and collect more/different data relevant to this task. There are many valuable sources of information for analysing the stock market. In addition to time series depicting the evolution of stock prices, acquire auxiliary data that is likely to be useful for the forecast, such as:

  1. **Social Media, e.g., Twitter:** This can be used to understand the public's sentiment towards the stock market;

  2. **Financial reports:** This can help explain what kind of factors are likely to affect the stock market the most;

  3. **News:** This can be used to draw links between current affairs and the stock market;

  4. **Meteorological data:** Sometimes climate or weather data is directly correlated to some companies' stock prices and should therefore be taken into account in financial analysis;

  5. **Others:** anything that can justifiably support your analysis.

Remember, you are looking for historical data, not live data, and that any data sources must be accessed through a web API rather than downloading files manually.
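Auxiliary data such as news or social media posts is often published on days when the market is closed, so it usually has to be aligned with the trading calendar before it can be used alongside prices. The sketch below shows one possible convention, assumed here for illustration: each event is counted on the first trading day on or after its date. The function name `assign_to_trading_days` and the dates are hypothetical.

```python
from bisect import bisect_left
from datetime import date


def assign_to_trading_days(trading_days, event_dates):
    """Count events per trading day.

    An event on a non-trading day rolls forward to the next trading day.
    Events before the first trading day are counted on the first one;
    events after the last trading day are dropped.
    """
    days = sorted(trading_days)
    counts = {day: 0 for day in days}
    for event in event_dates:
        # Index of the first trading day that is >= the event date.
        i = bisect_left(days, event)
        if i < len(days):
            counts[days[i]] += 1
    return counts


# Made-up example: a Friday and the following Monday as trading days.
trading_days = [date(2019, 4, 5), date(2019, 4, 8)]
events = [
    date(2019, 4, 5),  # Friday: counted on Friday
    date(2019, 4, 6),  # Saturday: rolls forward to Monday
    date(2019, 4, 7),  # Sunday: rolls forward to Monday
    date(2019, 4, 9),  # after the last trading day: dropped
]
counts = assign_to_trading_days(trading_days, events)
```

Rolling events forward rather than backward avoids using information that was not yet available on the earlier trading day, which matters when the aligned series later feeds a forecasting model.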
