Processing stock quant data with Python pandas: Note 4

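This note updates the daily quotes to the latest date. Below is the stock_daily table for Shenzhen Development Bank (now Ping An Bank, 000001.SZ) after updating through 20240715. Because my account does not have enough points, I cannot download the adjusted quotes from the general tushare.pro interface and have to use the older daily interface, pro.daily. The data it returns contains no adjustment factors and no moving averages.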

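Tushare's general quotes interface, pro_bar, combines quotes for stocks (unadjusted, forward-adjusted, backward-adjusted), indices, digital currencies, ETFs, futures, and options. It is planned to cover all traded quotes, including foreign exchange, and it also provides minute-level data. Each data type has its own points requirement, and pro_bar cannot be used without enough points (source: Tushare data documentation).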

# Fetch forward-adjusted (qfq) quotes for 000001
df = ts.pro_bar(ts_code='000001.SZ', adj='qfq', start_date='20180101', end_date='20181011')

              ts_code trade_date     open     high      low    close  \
trade_date
20181011    000001.SZ   20181011  1085.71  1097.59  1047.90  1065.19
20181010    000001.SZ   20181010  1138.65  1151.61  1121.36  1128.92
20181009    000001.SZ   20181009  1130.00  1155.93  1122.44  1140.81
20181008    000001.SZ   20181008  1155.93  1165.65  1128.92  1128.92
20180928    000001.SZ   20180928  1164.57  1217.51  1164.57  1193.74

# Fetch quotes for the Shanghai Composite Index
df = ts.pro_bar(ts_code='000001.SH', asset='I', start_date='20180101', end_date='20181011')

In [10]: df.head()
Out[10]:
     ts_code trade_date      close       open       high        low  \
0  000001.SH   20181011  2583.4575  2643.0740  2661.2859  2560.3164
1  000001.SH   20181010  2725.8367  2723.7242  2743.5480  2703.0626
2  000001.SH   20181009  2721.0130  2713.7319  2734.3142  2711.1971
3  000001.SH   20181008  2716.5104  2768.2075  2771.9384  2710.1781
4  000001.SH   20180928  2821.3501  2794.2644  2821.7553  2791.8363

   pre_close    change  pct_chg          vol       amount
0  2725.8367 -142.3792     -5.2233  197150702.0  170057762.5
1  2721.0130    4.8237      0.1773  113485736.0  111312455.3
2  2716.5104    4.5026      0.1657  116771899.0  110292457.8
3  2821.3501 -104.8397     -3.7159  149501388.0  141531551.8
4  2791.7748   29.5753      1.0594  134290456.0  125369989.4

# Moving averages
df = ts.pro_bar(ts_code='000001.SZ', start_date='20180101', end_date='20181011', ma=[5, 20, 50])
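Note: the average-price and average-volume columns returned by pro_bar are computed dynamically from the rows in the requested range. To get moving averages for a given period, set start_date earlier than the first date you want by at least the largest moving-average window, then trim the result to the period you need. For example, to get a 3-day moving average starting at 20190801, set start_date='20190729' and then drop the records before 20190801.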
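Since pro.daily returns no moving-average columns, the same warm-up-then-trim idea can be applied locally with pandas. The following is a minimal sketch, not Tushare's own calculation: the helper `add_moving_averages` and the `ma3` column name are hypothetical, and the closing prices are copied from the stock_daily output shown in this note.

```python
import pandas as pd


def add_moving_averages(df, windows, keep_from):
    """Add simple moving averages of `close`, then drop warm-up rows before `keep_from`."""
    # Rolling windows need the rows in ascending date order.
    out = df.sort_values("trade_date").reset_index(drop=True)
    for w in windows:
        out[f"ma{w}"] = out["close"].rolling(window=w).mean()
    # YYYYMMDD strings compare correctly as plain strings.
    return out[out["trade_date"] >= keep_from].reset_index(drop=True)


# Closing prices of 000001.SZ, newest first, as in the database output.
daily = pd.DataFrame({
    "trade_date": ["20240715", "20240712", "20240711", "20240710", "20240709", "20240708"],
    "close": [10.33, 10.31, 10.13, 10.14, 10.07, 9.93],
})

# The 3-day average is wanted from 20240710 onward, so the input starts
# two trading days earlier (20240708) to warm up the window.
result = add_moving_averages(daily, windows=[3], keep_from="20240710")
print(result)
```

Without the two warm-up rows, the first two values of `ma3` would be NaN, which is the same effect the note describes for pro_bar.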


# Turnover rate (tor) and volume ratio (vr)
df = ts.pro_bar(ts_code='000001.SZ', start_date='20180101', end_date='20181011', factors=['tor', 'vr'])

As for the pro_api parameter: if the token was already set at the start with ts.set_token('xxxx'), this parameter is not required.
For example:
df = ts.pro_bar(ts_code='000001.SH', asset='I', start_date='20180101', end_date='20181011')

The table below shows the quotes fetched with pro.daily, as viewed in the local database:

| 236949 | 000001.SZ | 20200213   | 14.71 | 14.88 | 14.61 | 14.65 |     14.77 |  -0.12 | -0.8125 | 1013205.28 | 1491327.713 |
| 236950 | 000001.SZ | 20200212   | 14.79 | 14.82 | 14.60 | 14.77 |     14.79 |  -0.02 | -0.1352 | 1070503.21 | 1573229.042 |
| 236951 | 000001.SZ | 20200211   | 14.60 | 14.94 | 14.56 | 14.79 |     14.50 |   0.29 |  2.0000 | 1407507.44 | 2077194.138 |
| 236952 | 000001.SZ | 20200210   | 14.51 | 14.53 | 14.30 | 14.50 |     14.62 |  -0.12 | -0.8208 | 1339495.24 | 1931983.482 |
| 236953 | 000001.SZ | 20200207   | 14.60 | 14.69 | 14.41 | 14.62 |     14.77 |  -0.15 | -1.0156 |  924852.96 | 1345053.255 |
| 236954 | 000001.SZ | 20200206   | 14.81 | 14.87 | 14.51 | 14.77 |     14.63 |   0.14 |  0.9569 | 1185815.72 | 1740107.625 |
| 236955 | 000001.SZ | 20200205   | 14.59 | 14.89 | 14.32 | 14.63 |     14.60 |   0.03 |  0.2055 | 1491380.21 | 2177632.043 |
| 236956 | 000001.SZ | 20200204   | 14.05 | 14.66 | 14.02 | 14.60 |     13.99 |   0.61 |  4.3603 | 1706172.07 | 2442932.842 |
| 236957 | 000001.SZ | 20200203   | 13.99 | 14.70 | 13.99 | 13.99 |     15.54 |  -1.55 | -9.9743 | 2259194.83 | 3201454.164 |
+--------+-----------+------------+-------+-------+-------+-------+-----------+--------+---------+------------+-------------+
1098 rows in set (0.01 sec)

mysql> select * from stock_daily where ts_code='000001.SZ';
+--------+-----------+------------+-------+-------+-------+-------+-----------+--------+---------+------------+-------------+
| id     | ts_code   | trade_date | open  | high  | low   | close | pre_close | change | pct_chg | vol        | amount      |
+--------+-----------+------------+-------+-------+-------+-------+-----------+--------+---------+------------+-------------+
| 169923 | 000001.SZ | 20200123   | 15.92 | 15.92 | 15.39 | 15.54 |     16.09 |  -0.55 | -3.4183 | 1100592.07 | 1723394.336 |
| 169924 | 000001.SZ | 20200122   | 15.92 | 16.16 | 15.71 | 16.09 |     16.00 |   0.09 |  0.5625 |  719464.91 | 1150933.398 |
| 169925 | 000001.SZ | 20200121   | 16.34 | 16.34 | 15.93 | 16.00 |     16.45 |  -0.45 | -2.7356 |  896603.10 | 1442171.431 |
| 169926 | 000001.SZ | 20200120   | 16.43 | 16.61 | 16.35 | 16.45 |     16.39 |   0.06 |  0.3661 |  746074.75 | 1226464.649 |
| 169927 | 000001.SZ | 20200117   | 16.38 | 16.55 | 16.35 | 16.39 |     16.33 |   0.06 |  0.3674 |  605436.69 |  995909.007 |
| 169928 | 000001.SZ | 20200116   | 16.52 | 16.57 | 16.20 | 16.33 |     16.52 |  -0.19 | -1.1501 | 1028104.67 | 1678888.507 |
| 169929 | 000001.SZ | 20200115   | 16.79 | 16.86 | 16.45 | 16.52 |     16.76 |  -0.24 | -1.4320 |  859439.12 | 1424889.228 |
| 169930 | 000001.SZ | 20200114   | 16.99 | 17.27 | 16.76 | 16.76 |     16.99 |  -0.23 | -1.3537 | 1304493.66 | 2217608.852 |
| 169931 | 000001.SZ | 20200113   | 16.75 | 17.03 | 16.61 | 16.99 |     16.69 |   0.30 |  1.7975 |  872133.36 | 1468271.683 |
| 169932 | 000001.SZ | 20200110   | 16.79 | 16.81 | 16.52 | 16.69 |     16.79 |  -0.10 | -0.5956 |  585548.45 |  975154.818 |
| 169933 | 000001.SZ | 20200109   | 16.81 | 16.93 | 16.53 | 16.79 |     16.66 |   0.13 |  0.7803 | 1031636.65 | 1725326.806 |
| 169934 | 000001.SZ | 20200108   | 17.00 | 17.05 | 16.63 | 16.66 |     17.15 |  -0.49 | -2.8571 |  847824.12 | 1423608.811 |
| 169935 | 000001.SZ | 20200107   | 17.13 | 17.28 | 16.95 | 17.15 |     17.07 |   0.08 |  0.4687 |  728607.56 | 1247047.135 |
| 169936 | 000001.SZ | 20200106   | 17.01 | 17.34 | 16.91 | 17.07 |     17.18 |  -0.11 | -0.6403 |  862083.50 | 1477930.193 |
| 169937 | 000001.SZ | 20200103   | 16.94 | 17.31 | 16.92 | 17.18 |     16.87 |   0.31 |  1.8376 | 1116194.81 | 1914495.474 |
| 169938 | 000001.SZ | 20200102   | 16.65 | 16.95 | 16.55 | 16.87 |     16.45 |   0.42 |  2.5532 | 1530231.87 | 2571196.482 |
| 235876 | 000001.SZ | 20240715   | 10.31 | 10.35 | 10.26 | 10.33 |     10.31 |   0.02 |  0.1940 |  869412.47 |  896975.833 |
| 235877 | 000001.SZ | 20240712   | 10.12 | 10.31 | 10.12 | 10.31 |     10.13 |   0.18 |  1.7769 | 1214163.69 | 1246877.666 |
| 235878 | 000001.SZ | 20240711   | 10.21 | 10.23 | 10.07 | 10.13 |     10.14 |  -0.01 | -0.0986 |  721392.25 |  731789.332 |
| 235879 | 000001.SZ | 20240710   | 10.07 | 10.20 | 10.05 | 10.14 |     10.07 |   0.07 |  0.6951 |  873865.11 |  886626.727 |
| 235880 | 000001.SZ | 20240709   |  9.94 | 10.10 |  9.91 | 10.07 |      9.93 |   0.14 |  1.4099 | 1032485.30 | 1034316.511 |
| 235881 | 000001.SZ | 20240708   |  9.94 | 10.02 |  9.85 |  9.93 |      9.97 |  -0.04 | -0.4012 |  868659.62 |  864791.776 |
| 235882 | 000001.SZ | 20240705   | 10.26 | 10.29 |  9.92 |  9.97 |     10.26 |  -0.29 | -2.8265 | 1713353.86 | 1721256.576 |
| 235883 | 000001.SZ | 20240704   | 10.30 | 10.38 | 10.25 | 10.26 |     10.31 |  -0.05 | -0.4850 |  735044.24 |  757464.387 |
| 235884 | 000001.SZ | 20240703   | 10.40 | 10.43 | 10.29 | 10.31 |     10.40 |  -0.09 | -0.8654 |  713067.46 |  737326.421 |
| 235885 | 000001.SZ | 20240702   | 10.30 | 10.48 | 10.28 | 10.40 |     10.35 |   0.05 |  0.4831 | 1384385.70 | 1440864.391 |
^C -- query aborted
+--------+-----------+------------+-------+-------+-------+-------+-----------+--------+---------+------------+-------------+
1098 rows in set (0.03 sec)
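The columns in the output above appear to be related by change = close - pre_close and pct_chg = change / pre_close * 100. The sketch below checks this with pandas on three rows copied from that output. It only shows that the relation holds for these rows, up to rounding; whether Tushare defines the columns this way for every row (for example around ex-dividend dates) is an assumption not confirmed here.

```python
import pandas as pd

# Three rows copied from the stock_daily output above.
rows = pd.DataFrame({
    "trade_date": ["20240715", "20240712", "20240711"],
    "close": [10.33, 10.31, 10.13],
    "pre_close": [10.31, 10.13, 10.14],
    "change": [0.02, 0.18, -0.01],
    "pct_chg": [0.1940, 1.7769, -0.0986],
})

recomputed_change = rows["close"] - rows["pre_close"]
recomputed_pct = recomputed_change / rows["pre_close"] * 100

# Tolerances allow for rounding: `change` is shown with 2 decimals, `pct_chg` with 4.
change_ok = (recomputed_change - rows["change"]).abs() < 0.005
pct_ok = (recomputed_pct - rows["pct_chg"]).abs() < 0.0001

print(rows.assign(change_ok=change_ok, pct_ok=pct_ok))
```

A check like this is a cheap way to spot rows that were damaged during download or insertion.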

mysql> SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = 'stock' AND table_name = 'stock_daily';
+-------------+------------+
| table_name  | table_rows |
+-------------+------------+
| stock_daily |      58964 |
+-------------+------------+
1 row in set (0.00 sec)

mysql> SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = 'stock' AND table_name = 'stock_basic';
+-------------+------------+
| table_name  | table_rows |
+-------------+------------+
| stock_basic |       5317 |
+-------------+------------+
1 row in set (0.00 sec)

mysql>

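Use the pro.daily interface to update the daily quotes to the current latest date. For reasons best left unsaid, the stock_daily table was downloaded in two passes (Notes 3 and 4).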

import pandas as pd
import tushare
import time

print(tushare.__version__)
# Replace the placeholder with your own Tushare token; do not publish a real token
tushare.set_token('YOUR_TUSHARE_TOKEN')
pro = tushare.pro_api()

import pymysql
#SELECT table_name, table_rows, data_length, index_length FROM stock_basic;
try:
    db = pymysql.connect(host='127.0.0.1', user='root', password='password', database='stock', charset='utf8')
    cursor = db.cursor()
    cursor.execute("select ts_code,list_date from stock_basic")
    stocks = cursor.fetchall()  # full list of stocks
except Exception as e:
    print("got error while connecting to db: \n%s\n" % e)
    raise  # nothing below can run without a connection

# To reduce the workload, only download daily data from 2020 onward
start_date = '20200101'

# Tushare takes dates as YYYYMMDD strings
end_date = time.strftime("%Y%m%d", time.localtime())
query = "INSERT ignore  INTO stock_daily(ts_code,trade_date,open,high,low,close,pre_close,`change`,pct_chg,vol,amount) VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);"

try:
    #cursor.execute("select ts_code,list_date from stock_basic limit 1")
    cursor.execute("select id, ts_code from stock_basic")
    stocks = cursor.fetchall()

    # Columns in the same order as the placeholders in `query`
    columns = ['ts_code', 'trade_date', 'open', 'high', 'low', 'close',
               'pre_close', 'change', 'pct_chg', 'vol', 'amount']

    for stock in stocks:  # all stocks
        stock_id = stock[0]
        ts_code = stock[1]

        # Start from the day after the latest trade_date already stored for this stock;
        # fall back to the global start_date when the stock has no rows yet.
        cursor.execute("select max(trade_date) from stock_daily where ts_code = %s", (ts_code,))
        trade_data = cursor.fetchone()
        if trade_data[0]:
            last_day = pd.to_datetime(str(trade_data[0]), format="%Y%m%d")
            next_start = (last_day + pd.Timedelta(days=1)).strftime("%Y%m%d")
        else:
            next_start = start_date

        # Output of a failed run:
        #   id: 297, ts_code: 000809.SZ, start_data: 20200124
        #   got error insert items to db:
        #   HTTPConnectionPool(host='api.waditu.com', port=80): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x06E8A810>: Failed to resolve 'api.waditu.com' ([Errno 11004] getaddrinfo failed)"))
        # pro.daily is a fairly early interface that Tushare apparently no longer maintains,
        # and downloads fail often, so note where the run failed and resume from that
        # position next time: id >= 297
        if stock_id >= 0:  # all stocks; raise the threshold to resume after a failure
            print("id: %s, ts_code: %s, start_data: %s" % (stock_id, ts_code, next_start))
            df = pro.daily(ts_code=ts_code, start_date=next_start, end_date=str(end_date))

            #print(df.columns)
            #df.shape
            #print(df.dtypes)
            #df.info(verbose=True)
            time.sleep(1)  # pause between requests

            # Turn NaN into NULL, then insert all rows for this stock in one batch
            clean = df[columns].astype(object).where(df[columns].notna(), None)
            rows = list(clean.itertuples(index=False, name=None))
            if rows:
                cursor.executemany(query, rows)
            db.commit()
except Exception as e:
    print("got error insert items to db: \n%s\n" % e)

finally:
    db.commit()
    cursor.close()
    db.close()
    print("Database connection closed!")
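The comment in the script suggests noting the id where a run failed and resuming from there by hand. One possible way to automate that is to persist the last completed id in a small file. The sketch below is only an illustration under stated assumptions: the file name `daily_progress.json` and the helpers `load_checkpoint`, `save_checkpoint`, and `process_all` are hypothetical, `fake_fetch_and_store` stands in for the real download-and-insert step, and the stock list is assumed to be sorted by ascending id.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the last successfully processed stock id, or -1 if no checkpoint exists."""
    if not os.path.exists(path):
        return -1
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["last_done_id"]


def save_checkpoint(path, stock_id):
    """Record that every stock up to and including `stock_id` is done."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"last_done_id": stock_id}, f)


def process_all(stocks, fetch_and_store, path):
    """Process (id, ts_code) pairs in order, skipping ids finished in an earlier run."""
    last_done = load_checkpoint(path)
    for stock_id, ts_code in stocks:
        if stock_id <= last_done:
            continue
        fetch_and_store(ts_code)  # may raise; the checkpoint then stays at the previous id
        save_checkpoint(path, stock_id)


# Demo: a stand-in download step that fails once, on the third stock.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "daily_progress.json")
stocks = [(1, "000001.SZ"), (2, "000002.SZ"), (3, "000004.SZ"), (4, "000005.SZ")]
processed = []
fail_once = {"000004.SZ"}


def fake_fetch_and_store(ts_code):
    if ts_code in fail_once:
        fail_once.discard(ts_code)
        raise ConnectionError("simulated network failure")
    processed.append(ts_code)


try:
    process_all(stocks, fake_fetch_and_store, checkpoint_path)  # first run stops at id 3
except ConnectionError:
    pass
process_all(stocks, fake_fetch_and_store, checkpoint_path)  # second run resumes at id 3
print(processed)
```

Each stock is processed exactly once across the two runs, and the second run starts at the stock that failed rather than at the beginning.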

Tonight the run was interrupted several times. Only 500-odd stocks had been updated when it broke again:

id: 548, ts_code: 002031.SZ, start_data: 20200124
id: 549, ts_code: 002032.SZ, start_data: 20200124
id: 550, ts_code: 002033.SZ, start_data: 20200124
id: 551, ts_code: 002034.SZ, start_data: 20200124
got error insert items to db: 
HTTPConnectionPool(host='api.waditu.com', port=80): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x06E8A1D0>: Failed to resolve 'api.waditu.com' ([Errno 11004] getaddrinfo failed)"))

Database connection closed!
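The error in the log above is a name-resolution failure, which is often transient. Instead of letting one failed request end the whole run, the request could be retried a few times with an increasing delay. This is a generic sketch, not part of Tushare: `call_with_retry` and `flaky_daily` are hypothetical names, and `flaky_daily` only simulates a request that fails twice before succeeding.

```python
import time


def call_with_retry(func, *args, retries=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs); on an exception, wait and retry, doubling the delay each time."""
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # broad on purpose for the sketch; narrow it in real code
            if attempt == retries:
                raise  # out of retries: let the caller see the last error
            delay = base_delay * (2 ** attempt)
            print("attempt %d failed (%s); retrying in %.1fs" % (attempt + 1, exc, delay))
            sleep(delay)


# Demo: a stand-in for the download call that fails twice, then succeeds.
calls = {"n": 0}


def flaky_daily(ts_code):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated name resolution failure")
    return "rows for %s" % ts_code


waits = []  # collect the delays instead of really sleeping
result = call_with_retry(flaky_daily, "002034.SZ", retries=3, base_delay=1.0, sleep=waits.append)
print(result, waits)
```

In the download script, the same wrapper could be placed around the pro.daily call, passing the keyword arguments through unchanged.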

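Below are the stock_daily row counts as records were added, checked three times during the download (for InnoDB tables, table_rows is an estimate rather than an exact count):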

mysql> SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = 'stock' AND table_name = 'stock_daily';
+-------------+------------+
| table_name  | table_rows |
+-------------+------------+
| stock_daily |     154919 |
+-------------+------------+
1 row in set (0.00 sec)

mysql> SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = 'stock' AND table_name = 'stock_daily';
+-------------+------------+
| table_name  | table_rows |
+-------------+------------+
| stock_daily |     375256 |
+-------------+------------+
1 row in set (0.00 sec)

mysql> SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = 'stock' AND table_name = 'stock_daily';
+-------------+------------+
| table_name  | table_rows |
+-------------+------------+
| stock_daily |     593003 |
+-------------+------------+
1 row in set (0.00 sec)

mysql>
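Because the script is re-run after every interruption, rows that were already stored get sent to the database again. The INSERT ignore statement only skips them if the table has a unique key covering the stock and the date. The table definition is not shown in this note, so the key on (ts_code, trade_date) below is an assumption. The sketch uses the standard-library sqlite3 module as a stand-in for MySQL, with SQLite's `INSERT OR IGNORE` playing the role of MySQL's `INSERT IGNORE`, and a reduced column set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock_daily ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " ts_code TEXT NOT NULL,"
    " trade_date TEXT NOT NULL,"
    " close REAL,"
    " UNIQUE (ts_code, trade_date))"  # assumed key; it is what makes re-runs harmless
)

insert_sql = "INSERT OR IGNORE INTO stock_daily (ts_code, trade_date, close) VALUES (?, ?, ?)"

# Two rows copied from the stock_daily output in this note.
batch = [
    ("000001.SZ", "20240712", 10.31),
    ("000001.SZ", "20240715", 10.33),
]

conn.executemany(insert_sql, batch)
conn.executemany(insert_sql, batch)  # simulated re-run after an interruption
conn.commit()

# COUNT(*) is exact, unlike the table_rows estimate in information_schema.
count = conn.execute("SELECT COUNT(*) FROM stock_daily").fetchone()[0]
print(count)
conn.close()
```

Without the unique key, the second batch would be stored again and the row count would double.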