基于大数据的短视频用户兴趣分析-hive+django+spider

  1. 开发语言:Python
  2. 框架:django
  3. Python版本:python3.8
  4. 数据库:mysql 5.7
  5. 数据库工具:Navicat12
  6. 开发软件:PyCharm

系统展示

管理员登录

管理员功能界面

短视频界面

短视频预测界面

看板展示

摘要

系统基于Django框架进行开发,利用Python语言进行业务逻辑的实现。借助Hadoop大数据平台,能够高效处理和存储海量的短视频相关数据。同时,采用随机森林回归算法对用户兴趣进行建模和预测,挖掘用户潜在的兴趣偏好。使用Echarts工具对分析结果进行可视化展示,使数据更加直观易懂。系统赋予管理员强大的管理功能,涵盖了对短视频的全方位管理,包括但不限于短视频的上传、审核、分类等操作,同时还对短视频的相关预测功能进行管理,如用户兴趣预测、视频热度预测等。通过整合多种先进技术,构建一个功能完善的短视频用户兴趣分析平台。通过对用户兴趣的精准分析,为短视频平台提供决策支持,优化内容推荐策略,提高用户的活跃度和留存率,同时也为广告投放、内容创作等相关业务提供有价值的参考,促进短视频行业的健康发展。

研究背景

大型短视频平台每天产生的数据量可达数PB级别,其中包含海量的用户行为数据、视频内容数据以及用户关系数据等。如此庞大的数据规模,远远超出了传统数据分析工具和方法的处理能力。然而,这些数据恰恰是洞察用户兴趣的宝贵资源。若能对其进行深入挖掘与分析,精准掌握用户的兴趣偏好,不仅可以帮助短视频平台优化内容推荐算法,为用户提供更符合其兴趣的视频内容,提升用户满意度和忠诚度,还能为广告主实现精准营销提供有力支持,推动整个短视频产业的进一步发展。这正是开展基于大数据的短视频用户兴趣分析研究的重要背景与现实意义。

关键技术

Python是解释型的脚本语言,在运行过程中,把程序转换为字节码和机器语言,说明性语言的程序在运行之前不必进行编译,而是一个专用的解释器,当被执行时,它都会被翻译,与之对应的还有编译性语言。

同时,这也是一种用于电脑编程的跨平台语言,这是一门将编译、交互和面向对象相结合的脚本语言(script language)。

Django用Python编写,属于开源Web应用程序框架。采用(模型M、视图V和模板t)的框架模式。该框架以比利时吉普赛爵士吉他手詹戈·莱因哈特命名。该架构的主要组件如下:

1.用于创建模型的对象关系映射。

2.最终目标是为用户设计一个完美的管理界面。

3.是目前最流行的URL设计解决方案。

4.模板语言对设计师来说是最友好的。

5.缓存系统。

Vue是一款流行的开源JavaScript框架,用于构建用户界面和单页面应用程序。Vue的核心库只关注视图层,易于上手并且可以与其他库或现有项目轻松整合。

MYSQL数据库运行速度快,安全性能也很高,而且对使用的平台没有任何的限制,所以被广泛应运到系统的开发中。MySQL是一个开源和多线程的关系管理数据库系统,MySQL是开放源代码的数据库,具有跨平台性。

B/S(浏览器/服务器)结构是目前主流的网络化的结构模式,它能够把系统核心功能集中在服务器上面,可以帮助系统开发人员简化操作,便于维护和使用。

系统分析

对系统的可行性分析以及对所有功能需求进行详细的分析,来查看该系统是否具有开发的可能。

系统设计

功能模块设计和数据库设计这两部分内容都有专门的表格和图片表示。

系统实现

管理员成功登录系统后,将拥有广泛的权限,能够管理多个功能模块,包括但不限于系统首页、个人中心、短视频、短视频预测等管理。短视频用户兴趣分析看板展示,作者信息:展示了诸如村驴、海上厨子等作者昵称,呈现创作者群体。视频数量:明确标注短视频总数为1750条,体现内容规模。视频热度:列出点赞数TOP10短视频,如妈妈,我回来啦等,附带话题标签,直观呈现热门内容。分辨率:通过饼图展示,1080p占比最大,其余360p、480p等有相应占比,反映视频画质分布。点赞数统计:以柱状图呈现,爱跑步的主持人郭靖等创作者点赞数清晰可见。评论数统计:折线图展示,部分视频评论数变化趋势一目了然。分享数统计:采用特殊图表,展示洪千辰等创作者分享情况。收藏数统计:柱状图呈现,胡茜Annnnn等创作者收藏数据可辨。还设有预测数据区域,可输入标题和作者昵称查询预测的评论、点赞等数据。

代码实现

python 复制代码
#coding:utf-8

#获取当前文件路径的根目录
parent_directory = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
dbtype, host, port, user, passwd, dbName, charset,hasHadoop = config_read(os.path.join(parent_directory,"config.ini"))
#MySQL连接配置
mysql_config = {
    'host': host,
    'user':user,
    'password': passwd,
    'database': dbName,
    'port':port
}

#获取预测可视化图表接口
def douyinrmforecast_forecastimgs(request):
    if request.method in ["POST", "GET"]:
        msg = {'code': normal_code, 'message': 'success'}
        # 指定目录
        directory = os.path.join(parent_directory, "templates", "upload", "douyinrmforecast")
        # 获取目录下的所有文件和文件夹名称
        all_items = os.listdir(directory)
        # 过滤出文件(排除文件夹)
        files = [f'upload/douyinrmforecast/{item}' for item in all_items if os.path.isfile(os.path.join(directory, item))]
        msg["data"] = files
        fontlist=[]
        for font in fm.fontManager.ttflist:
            fontlist.append(font.name)
        msg["message"]=fontlist
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_forecast(request):
    if request.method in ["POST", "GET"]:
        msg = {'code': normal_code, "msg": mes.normal_code}
        #1.获取数据集
        req_dict = request.session.get("req_dict")
        connection = pymysql.connect(**mysql_config)
        query = "SELECT title,author, comment_count,digg_count,share_count,collect_count FROM douyinrm"
        #2.处理缺失值
        data = pd.read_sql(query, connection).dropna()
        id = req_dict.pop('id',None)
        df = to_forecast(data,req_dict,None)
        #9.创建数据库连接,将DataFrame 插入数据库
        connection_string = f"mysql+pymysql://{mysql_config['user']}:{mysql_config['password']}@{mysql_config['host']}:{mysql_config['port']}/{mysql_config['database']}"
        engine = create_engine(connection_string)
        try:
            if req_dict :
                #遍历 DataFrame,并逐行更新数据库
                with engine.connect() as connection:
                    for index, row in df.iterrows():
                        sql = """
                        INSERT INTO douyinrmforecast (id
                        ,comment_count
                        ,digg_count
                        ,share_count
                        ,collect_count
                        )
                        VALUES (%(id)s
                    ,%(comment_count)s
                    ,%(digg_count)s
                    ,%(share_count)s
                    ,%(collect_count)s
                        )
                        ON DUPLICATE KEY UPDATE
                        comment_count = VALUES(comment_count),
                        digg_count = VALUES(digg_count),
                        share_count = VALUES(share_count),
                        collect_count = VALUES(collect_count)
                        """
                        connection.execute(sql, {'id': id
                            , 'comment_count': row['comment_count']
                            , 'digg_count': row['digg_count']
                            , 'share_count': row['share_count']
                            , 'collect_count': row['collect_count']
                        })
            else:
                df.to_sql('douyinrmforecast', con=engine, if_exists='append', index=False)
            print("数据更新成功!")
        except Exception as e:
            print(f"发生错误: {e}")
        finally:
            engine.dispose()  # 关闭数据库连接
        return JsonResponse(msg, encoder=CustomJsonEncoder)
def to_forecast(data,req_dict,value):
    if len(data) < 5:
        print(f"的样本数量不足: {len(data)}")
        return pd.DataFrame()
    #3.处理特征值和目标值
    labels={}
    for key in data.keys():
        if pd.api.types.is_string_dtype(data[key]):
            label_encoder = LabelEncoder()
            labels[key] = label_encoder
            data[key] = label_encoder.fit_transform(data[key])
    #4.数据集划分
    X = data[[
        'title',
        'author',
    ]]
    y = data[[
        'comment_count',
        'digg_count',
        'share_count',
        'collect_count',
    ]]
    x_train, x_test, y_train, y_test = train_test_split(X, y,test_size=0.2, random_state=22)
    #5.构建预测特征值
    #根据输入的特征值去预测
    if req_dict:
        req_dict.pop('addtime',None)
        future_df = pd.DataFrame([req_dict])
        for key in future_df.keys():
           if key in labels:
               encoder = labels[key]
               values = future_df[key][0]
               try:
                   values = encoder.transform([values])[0]
               except ValueError as e: #处理未见过的标签
                   values = np.array([encoder.transform([v])[0] if v in encoder.classes_ else -1 for v in values]).sum()
               future_df[key][0] = values
    else:
        future_df = x_test
    #特征工程-标准化
    estimator_file = os.path.join(parent_directory, "douyinrmforecast.pkl")
    estimator = RandomForestRegressor(n_estimators=100, random_state=42)
    _, num_columns = y_train.shape
    if num_columns>=2:
        estimator.fit(x_train, y_train)
    else:
        estimator.fit(x_train, y_train.values.ravel())
    y_pred = estimator.predict(x_test)
    plt.rcParams['font.sans-serif'] = ['SimHei']  # 使用黑体 SimHei
    plt.rcParams['axes.unicode_minus'] = False  # 解决负号 '-' 显示为方块的问题
    # 绘制预测值与实际值的散点图
    plt.figure(figsize=(10, 6))
    plt.scatter(y_test, y_pred, alpha=0.5)
    plt.xlabel("实际值")
    plt.ylabel("预测值")
    plt.title("实际值与预测值(随机森林回归)")
    directory =os.path.join(parent_directory, "templates","upload","douyinrmforecast","figure.png")
    os.makedirs(os.path.dirname(directory), exist_ok=True)
    plt.savefig(directory)
    plt.clf()
    # 绘制特征重要性
    feature_importances = estimator.feature_importances_
    features = [
        'title',
        'author',
    ]
    sns.barplot(x=feature_importances, y=features)
    plt.xlabel("重要性得分")
    plt.ylabel("特征")
    plt.title("特征重要性")
    if value!=None:
        directory =os.path.join(parent_directory, "templates","upload","douyinrmforecast","{value}_figure.png")
        os.makedirs(os.path.dirname(directory), exist_ok=True)
        plt.savefig(directory)
    else:
        directory =os.path.join(parent_directory, "templates","upload","douyinrmforecast","figure_other.png")
        os.makedirs(os.path.dirname(directory), exist_ok=True)
        plt.savefig(directory)
    plt.clf()
    #保存模型
    joblib.dump(estimator, estimator_file)

    #7.进行预测
    y_predict = estimator.predict(future_df)
    if isinstance(y_predict[0], numbers.Number) or len(y_predict[0])<2:
        y_predict = np.mean(y_predict, axis=0)
        if not isinstance(y_predict, np.ndarray):
            y_predict = np.expand_dims(y_predict, axis=0)
    df = pd.DataFrame(y_predict, columns=[
        'comment_count',
        'digg_count',
        'share_count',
        'collect_count',
    ])
    df['comment_count']=df['comment_count'].astype(int)
    df['digg_count']=df['digg_count'].astype(int)
    df['share_count']=df['share_count'].astype(int)
    df['collect_count']=df['collect_count'].astype(int)
    return df

def douyinrmforecast_register(request):
    if request.method in ["POST", "GET"]:
        msg = {'code': normal_code, "msg": mes.normal_code}
        req_dict = request.session.get("req_dict")


        error = douyinrmforecast.createbyreq(douyinrmforecast, douyinrmforecast, req_dict)
        if error is Exception or (type(error) is str and "Exception" in error):
            msg['code'] = crud_error_code
            msg['msg'] = "用户已存在,请勿重复注册!"
        else:
            msg['data'] = error
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_login(request):
    if request.method in ["POST", "GET"]:
        msg = {'code': normal_code, "msg": mes.normal_code}
        req_dict = request.session.get("req_dict")
        datas = douyinrmforecast.getbyparams(douyinrmforecast, douyinrmforecast, req_dict)
        if not datas:
            msg['code'] = password_error_code
            msg['msg'] = mes.password_error_code
            return JsonResponse(msg, encoder=CustomJsonEncoder)

        try:
            __sfsh__= douyinrmforecast.__sfsh__
        except:
            __sfsh__=None

        if  __sfsh__=='是':
            if datas[0].get('sfsh')!='是':
                msg['code']=other_code
                msg['msg'] = "账号已锁定,请联系管理员审核!"
                return JsonResponse(msg, encoder=CustomJsonEncoder)
                
        req_dict['id'] = datas[0].get('id')


        return Auth.authenticate(Auth, douyinrmforecast, req_dict)


def douyinrmforecast_logout(request):
    if request.method in ["POST", "GET"]:
        msg = {
            "msg": "登出成功",
            "code": 0
        }

        return JsonResponse(msg, encoder=CustomJsonEncoder)


def douyinrmforecast_resetPass(request):
    '''
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code}

        req_dict = request.session.get("req_dict")

        columns=  douyinrmforecast.getallcolumn( douyinrmforecast, douyinrmforecast)

        try:
            __loginUserColumn__= douyinrmforecast.__loginUserColumn__
        except:
            __loginUserColumn__=None
        username=req_dict.get(list(req_dict.keys())[0])
        if __loginUserColumn__:
            username_str=__loginUserColumn__
        else:
            username_str=username
        if 'mima' in columns:
            password_str='mima'
        else:
            password_str='password'

        init_pwd = '123456'
        recordsParam = {}
        recordsParam[username_str] = req_dict.get("username")
        records=douyinrmforecast.getbyparams(douyinrmforecast, douyinrmforecast, recordsParam)
        if len(records)<1:
            msg['code'] = 400
            msg['msg'] = '用户不存在'
            return JsonResponse(msg, encoder=CustomJsonEncoder)

        eval('''douyinrmforecast.objects.filter({}='{}').update({}='{}')'''.format(username_str,username,password_str,init_pwd))
        
        return JsonResponse(msg, encoder=CustomJsonEncoder)



def douyinrmforecast_session(request):
    '''
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code,"msg": mes.normal_code, "data": {}}

        req_dict={"id":request.session.get('params').get("id")}
        msg['data']  = douyinrmforecast.getbyparams(douyinrmforecast, douyinrmforecast, req_dict)[0]

        return JsonResponse(msg, encoder=CustomJsonEncoder)


def douyinrmforecast_default(request):

    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code,"msg": mes.normal_code, "data": {}}
        req_dict = request.session.get("req_dict")
        req_dict.update({"isdefault":"是"})
        data=douyinrmforecast.getbyparams(douyinrmforecast, douyinrmforecast, req_dict)
        if len(data)>0:
            msg['data']  = data[0]
        else:
            msg['data']  = {}
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_page(request):
    '''
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code,  "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = request.session.get("req_dict")

        global douyinrmforecast
        #当前登录用户信息
        tablename = request.session.get("tablename")

        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize']  =douyinrmforecast.page(douyinrmforecast, douyinrmforecast, req_dict, request)
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_autoSort(request):
    '''
    .智能推荐功能(表属性:[intelRecom(是/否)],新增clicktime[前端不显示该字段]字段(调用info/detail接口的时候更新),按clicktime排序查询)
主要信息列表(如商品列表,新闻列表)中使用,显示最近点击的或最新添加的5条记录就行
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code,  "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = request.session.get("req_dict")
        if "clicknum"  in douyinrmforecast.getallcolumn(douyinrmforecast,douyinrmforecast):
            req_dict['sort']='clicknum'
        elif "browseduration"  in douyinrmforecast.getallcolumn(douyinrmforecast,douyinrmforecast):
            req_dict['sort']='browseduration'
        else:
            req_dict['sort']='clicktime'
        req_dict['order']='desc'
        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize']  = douyinrmforecast.page(douyinrmforecast,douyinrmforecast, req_dict)

        return JsonResponse(msg, encoder=CustomJsonEncoder)

#分类列表
def douyinrmforecast_lists(request):
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code,  "data":[]}
        msg['data'],_,_,_,_  = douyinrmforecast.page(douyinrmforecast, douyinrmforecast, {})
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_query(request):
    '''
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}
        try:
            query_result = douyinrmforecast.objects.filter(**request.session.get("req_dict")).values()
            msg['data'] = query_result[0]
        except Exception as e:

            msg['code'] = crud_error_code
            msg['msg'] = f"发生错误:{e}"
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_list(request):
    '''
    前台分页
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code,  "data":{"currPage":1,"totalPage":1,"total":1,"pageSize":10,"list":[]}}
        req_dict = request.session.get("req_dict")
        #获取全部列名
        columns=  douyinrmforecast.getallcolumn( douyinrmforecast, douyinrmforecast)
        if "vipread" in req_dict and "vipread" not in columns:
          del req_dict["vipread"]
        #表属性[foreEndList]前台list:和后台默认的list列表页相似,只是摆在前台,否:指没有此页,是:表示有此页(不需要登陆即可查看),前要登:表示有此页且需要登陆后才能查看
        try:
            __foreEndList__=douyinrmforecast.__foreEndList__
        except:
            __foreEndList__=None
        try:
            __foreEndListAuth__=douyinrmforecast.__foreEndListAuth__
        except:
            __foreEndListAuth__=None

        #authSeparate
        try:
            __authSeparate__=douyinrmforecast.__authSeparate__
        except:
            __authSeparate__=None

        if __foreEndListAuth__ =="是" and __authSeparate__=="是":
            tablename=request.session.get("tablename")
            if tablename!="users" and request.session.get("params") is not None:
                req_dict['userid']=request.session.get("params").get("id")

        tablename = request.session.get("tablename")
        if tablename == "users" and req_dict.get("userid") != None:#判断是否存在userid列名
            del req_dict["userid"]
        else:
            __isAdmin__ = None

            allModels = apps.get_app_config('main').get_models()
            for m in allModels:
                if m.__tablename__==tablename:

                    try:
                        __isAdmin__ = m.__isAdmin__
                    except:
                        __isAdmin__ = None
                    break

            if __isAdmin__ == "是":
                if req_dict.get("userid"):
                    # del req_dict["userid"]
                    pass
            else:
                #非管理员权限的表,判断当前表字段名是否有userid
                if "userid" in columns:
                    try:
                        pass
                    except:
                        pass
        #当列属性authTable有值(某个用户表)[该列的列名必须和该用户表的登陆字段名一致],则对应的表有个隐藏属性authTable为"是",那么该用户查看该表信息时,只能查看自己的
        try:
            __authTables__=douyinrmforecast.__authTables__
        except:
            __authTables__=None

        if __authTables__!=None and  __authTables__!={} and __foreEndListAuth__=="是":
            for authColumn,authTable in __authTables__.items():
                if authTable==tablename:
                    try:
                        del req_dict['userid']
                    except:
                        pass
                    params = request.session.get("params")
                    req_dict[authColumn]=params.get(authColumn)
                    username=params.get(authColumn)
                    break
        
        if douyinrmforecast.__tablename__[:7]=="discuss":
            try:
                del req_dict['userid']
            except:
                pass


        q = Q()
        msg['data']['list'], msg['data']['currPage'], msg['data']['totalPage'], msg['data']['total'], \
        msg['data']['pageSize']  = douyinrmforecast.page(douyinrmforecast, douyinrmforecast, req_dict, request, q)
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_save(request):
    '''
    后台新增
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}
        req_dict = request.session.get("req_dict")
        if 'clicktime' in req_dict.keys():
            del req_dict['clicktime']
        tablename=request.session.get("tablename")
        __isAdmin__ = None
        allModels = apps.get_app_config('main').get_models()
        for m in allModels:
            if m.__tablename__==tablename:

                try:
                    __isAdmin__ = m.__isAdmin__
                except:
                    __isAdmin__ = None
                break

        #获取全部列名
        columns=  douyinrmforecast.getallcolumn( douyinrmforecast, douyinrmforecast)
        if tablename!='users' and req_dict.get("userid")==None and 'userid' in columns  and __isAdmin__!='是':
            params=request.session.get("params")
            req_dict['userid']=params.get('id')


        if 'addtime' in req_dict.keys():
            del req_dict['addtime']

        idOrErr= douyinrmforecast.createbyreq(douyinrmforecast,douyinrmforecast, req_dict)
        if idOrErr is Exception:
            msg['code'] = crud_error_code
            msg['msg'] = idOrErr
        else:
            msg['data'] = idOrErr

        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_add(request):
    '''
    前台新增
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}
        req_dict = request.session.get("req_dict")
        tablename=request.session.get("tablename")

        #获取全部列名
        columns=  douyinrmforecast.getallcolumn( douyinrmforecast, douyinrmforecast)
        try:
            __authSeparate__=douyinrmforecast.__authSeparate__
        except:
            __authSeparate__=None

        if __authSeparate__=="是":
            tablename=request.session.get("tablename")
            if tablename!="users" and 'userid' in columns:
                try:
                    req_dict['userid']=request.session.get("params").get("id")
                except:
                    pass

        try:
            __foreEndListAuth__=douyinrmforecast.__foreEndListAuth__
        except:
            __foreEndListAuth__=None

        if __foreEndListAuth__ and __foreEndListAuth__!="否":
            tablename=request.session.get("tablename")
            if tablename!="users":
                req_dict['userid']=request.session.get("params").get("id")


        if 'addtime' in req_dict.keys():
            del req_dict['addtime']
        error= douyinrmforecast.createbyreq(douyinrmforecast,douyinrmforecast, req_dict)
        if error is Exception:
            msg['code'] = crud_error_code
            msg['msg'] = error
        else:
            msg['data'] = error
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_thumbsup(request,id_):
    '''
     点赞:表属性thumbsUp[是/否],刷表新增thumbsupnum赞和crazilynum踩字段,
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}
        req_dict = request.session.get("req_dict")
        id_=int(id_)
        type_=int(req_dict.get("type",0))
        rets=douyinrmforecast.getbyid(douyinrmforecast,douyinrmforecast,id_)

        update_dict={
        "id":id_,
        }
        if type_==1:#赞
            update_dict["thumbsupnum"]=int(rets[0].get('thumbsupnum'))+1
        elif type_==2:#踩
            update_dict["crazilynum"]=int(rets[0].get('crazilynum'))+1
        error = douyinrmforecast.updatebyparams(douyinrmforecast,douyinrmforecast, update_dict)
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return JsonResponse(msg, encoder=CustomJsonEncoder)


def douyinrmforecast_info(request,id_):
    '''
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}

        data = douyinrmforecast.getbyid(douyinrmforecast,douyinrmforecast, int(id_))
        if len(data)>0:
            msg['data']=data[0]
            if msg['data'].__contains__("reversetime"):
                if isinstance(msg['data']['reversetime'], datetime.datetime):
                    msg['data']['reversetime'] = msg['data']['reversetime'].strftime("%Y-%m-%d %H:%M:%S")
                else:
                    if msg['data']['reversetime'] != None:
                        reversetime = datetime.datetime.strptime(msg['data']['reversetime'], '%Y-%m-%d %H:%M:%S')
                        msg['data']['reversetime'] = reversetime.strftime("%Y-%m-%d %H:%M:%S")

        #浏览点击次数
        try:
            __browseClick__= douyinrmforecast.__browseClick__
        except:
            __browseClick__=None

        if __browseClick__=="是"  and  "clicknum"  in douyinrmforecast.getallcolumn(douyinrmforecast,douyinrmforecast):
            try:
                clicknum=int(data[0].get("clicknum",0))+1
            except:
                clicknum=0+1
            click_dict={"id":int(id_),"clicknum":clicknum,"clicktime":datetime.datetime.now()}
            ret=douyinrmforecast.updatebyparams(douyinrmforecast,douyinrmforecast,click_dict)
            if ret!=None:
                msg['code'] = crud_error_code
                msg['msg'] = ret
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_detail(request,id_):
    '''
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}

        data =douyinrmforecast.getbyid(douyinrmforecast,douyinrmforecast, int(id_))
        if len(data)>0:
            msg['data']=data[0]
            if msg['data'].__contains__("reversetime"):
                if isinstance(msg['data']['reversetime'], datetime.datetime):
                    msg['data']['reversetime'] = msg['data']['reversetime'].strftime("%Y-%m-%d %H:%M:%S")
                else:
                    if msg['data']['reversetime'] != None:
                        reversetime = datetime.datetime.strptime(msg['data']['reversetime'], '%Y-%m-%d %H:%M:%S')
                        msg['data']['reversetime'] = reversetime.strftime("%Y-%m-%d %H:%M:%S")

        #浏览点击次数
        try:
            __browseClick__= douyinrmforecast.__browseClick__
        except:
            __browseClick__=None

        if __browseClick__=="是"   and  "clicknum"  in douyinrmforecast.getallcolumn(douyinrmforecast,douyinrmforecast):
            try:
                clicknum=int(data[0].get("clicknum",0))+1
            except:
                clicknum=0+1
            click_dict={"id":int(id_),"clicknum":clicknum,"clicktime":datetime.datetime.now()}

            ret=douyinrmforecast.updatebyparams(douyinrmforecast,douyinrmforecast,click_dict)
            if ret!=None:
                msg['code'] = crud_error_code
                msg['msg'] = ret
        return JsonResponse(msg, encoder=CustomJsonEncoder)

def douyinrmforecast_update(request):
    '''
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}
        req_dict = request.session.get("req_dict")
        if 'clicktime' in req_dict.keys() and req_dict['clicktime']=="None":
            del req_dict['clicktime']
        if req_dict.get("mima") and "mima" not in douyinrmforecast.getallcolumn(douyinrmforecast,douyinrmforecast) :
            del req_dict["mima"]
        if req_dict.get("password") and "password" not in douyinrmforecast.getallcolumn(douyinrmforecast,douyinrmforecast) :
            del req_dict["password"]
        try:
            del req_dict["clicknum"]
        except:
            pass


        error = douyinrmforecast.updatebyparams(douyinrmforecast, douyinrmforecast, req_dict)
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error

        return JsonResponse(msg)


def douyinrmforecast_delete(request):
    '''
    批量删除
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code, "data": {}}
        req_dict = request.session.get("req_dict")

        error=douyinrmforecast.deletes(douyinrmforecast,
            douyinrmforecast,
             req_dict.get("ids")
        )
        if error!=None:
            msg['code'] = crud_error_code
            msg['msg'] = error
        return JsonResponse(msg)


def douyinrmforecast_vote(request,id_):
    '''
    浏览点击次数(表属性[browseClick:是/否],点击字段(clicknum),调用info/detail接口的时候后端自动+1)、投票功能(表属性[vote:是/否],投票字段(votenum),调用vote接口后端votenum+1)
统计商品或新闻的点击次数;提供新闻的投票功能
    '''
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": mes.normal_code}


        data= douyinrmforecast.getbyid(douyinrmforecast, douyinrmforecast, int(id_))
        for i in data:
            votenum=i.get('votenum')
            if votenum!=None:
                params={"id":int(id_),"votenum":votenum+1}
                error=douyinrmforecast.updatebyparams(douyinrmforecast,douyinrmforecast,params)
                if error!=None:
                    msg['code'] = crud_error_code
                    msg['msg'] = error
        return JsonResponse(msg)

def douyinrmforecast_importExcel(request):
    if request.method in ["POST", "GET"]:
        msg = {"code": normal_code, "msg": "成功", "data": {}}

        excel_file = request.FILES.get("file", "")
        if excel_file.size > 100 * 1024 * 1024:  # 限制为 100MB
            msg['code'] = 400
            msg["msg"] = '文件大小不能超过100MB'
            return JsonResponse(msg)

        file_type = excel_file.name.split('.')[1]
        
        if file_type in ['xlsx', 'xls']:
            data = xlrd.open_workbook(filename=None, file_contents=excel_file.read())
            table = data.sheets()[0]
            rows = table.nrows
            
            try:
                for row in range(1, rows):
                    row_values = table.row_values(row)
                    req_dict = {}
                    douyinrmforecast.createbyreq(douyinrmforecast, douyinrmforecast, req_dict)
                    
            except:
                pass
                
        else:
            msg = {
                "msg": "文件类型错误",
                "code": 500
            }
                
        return JsonResponse(msg)

def douyinrmforecast_autoSort2(request):
    return JsonResponse({"code": 0, "msg": '',  "data":{}})

系统测试

黑盒测试,顾名思义,它摒弃了对内部程序构造和逻辑的探究,其核心在于验证系统的外部行为是否符合预设的需求规格。测试者关注的是输入引发的预期输出,而非内部工作流程,如进行功能验证、接口协调和系统整合测试,以此确保用户界面的正确无误。

相对而言,白盒测试则采取更为深入的视角,它深入系统的内部结构,着重检验程序逻辑的完整性以及代码执行的覆盖率。这种方法旨在揭露隐藏的编程错误,通过细致的逻辑路径分析,确保代码的可靠性和健壮性。白盒测试通常包括单元测试、集成测试和系统测试,旨在发现并修复代码中的逻辑错误和潜在缺陷。

结论

算法模型方面,随机森林回归算法表现优异。其综合用户多元特征,精准预测用户对短视频的兴趣程度,有效挖掘用户兴趣偏好。同时,结合聚类、关联规则等算法,从不同角度剖析用户兴趣,构建出更全面、准确的用户兴趣模型,为短视频平台的个性化推荐提供有力支撑。可视化与系统实现上,ECharts将复杂分析结果转化为直观易懂的图表,涵盖点赞、评论、收藏等多维度数据,方便平台运营者及相关人员洞察用户兴趣动态。基于Django搭建的Web应用系统,集成数据处理、兴趣建模、算法预测等功能,为短视频用户兴趣分析提供了完整、高效的解决方案。

相关推荐
汝生淮南吾在北1 小时前
SpringBoot+Vue游戏攻略网站
前端·vue.js·spring boot·后端·游戏·毕业设计·毕设
Mxsoft6192 小时前
我发现Flink事件时间窗口对齐,解决实时巡检数据延迟救场!
大数据·flink
hg01183 小时前
中企助力莫桑比克纳卡拉走廊物流体系全面提升
大数据
外参财观3 小时前
会员制大考:盒马交卷离场,山姆答题艰难
大数据·人工智能
张人玉3 小时前
大数据hadoop系列——在ubuntu上安装HBase 伪分布式
大数据·hadoop·分布式·hbase
Arva .3 小时前
介绍一下你知道的锁
大数据
檐下翻书1733 小时前
集团组织架构图在线设计 多部门协作编辑工具
大数据·论文阅读·人工智能·物联网·架构·流程图·论文笔记
霸王大陆3 小时前
《零基础学PHP:从入门到实战》教程-模块八:面向对象编程(OOP)入门-5
开发语言·笔记·php·课程设计
小王毕业啦4 小时前
2008-2023年 全国统一大市场发展水平
大数据·人工智能·数据挖掘·数据分析·数据统计·社科数据·实证数据