Stage 1: Big Data Foundations (Java Language Fundamentals)

1.1: Introduction to Java Development

1.1.1 History of Java

1.1.2 Application areas of Java

1.1.3 Features of the Java language

1.1.4 Object orientation in Java

1.1.5 Java performance classification

1.1.6 Setting up the Java environment

1.1.7 How Java works

 

1.2: Getting Familiar with the Eclipse Development Tool

1.2.1 Introduction to Eclipse and downloading it

1.2.2 Installing the Chinese language pack for Eclipse

1.2.3 Configuring and starting Eclipse

1.2.4 The Eclipse workbench and views

1.2.5 The “Package Explorer” view

1.2.6 Using Eclipse

1.2.7 Writing program code in the editor

 

1.3: Java Language Basics

1.3.1 Structure of the Java main class

1.3.2 Primitive data types

1.3.3 Variables and constants

1.3.4 Java operators

1.3.5 Data type conversion

1.3.6 Code comments and coding conventions

1.3.7 Java help documentation

 

1.4: Java Flow Control

1.4.1 Compound statements

1.4.2 Conditional statements

1.4.3 The if conditional statement

1.4.4 The switch multi-branch statement

1.4.5 The while loop statement

1.4.6 The do…while loop statement

1.4.7 The for loop statement

 

1.5: Java Strings

1.5.1 The String class

1.5.2 Concatenating strings

1.5.3 Getting string information

1.5.4 String operations

1.5.5 Formatting strings

1.5.6 Using regular expressions

1.5.7 The string builder (StringBuilder)

 

1.6: Java Arrays, Classes, and Objects

1.6.1 Array overview

1.6.2 Creating and using one-dimensional arrays

1.6.3 Creating and using two-dimensional arrays

1.6.4 Basic array operations

1.6.5 Array sorting algorithms

1.6.6 Java classes and constructors

1.6.7 Java objects, attributes, and behaviors

 

1.7: Number-Handling Classes and Core Techniques

1.7.1 Number formatting and arithmetic

1.7.2 Random numbers and big-number arithmetic

1.7.3 Class inheritance and the Object class

1.7.4 Object type casting

1.7.5 Checking an object's type with the instanceof operator

1.7.6 Method overloading and polymorphism

1.7.7 Abstract classes and interfaces

 

1.8: I/O, Reflection, and Multithreading

1.8.1 Stream overview and the File class

1.8.2 File input/output streams

1.8.3 Buffered input/output streams

1.8.4 The Class class and Java reflection

1.8.5 Annotation features and type information

1.8.6 Enumerated types and generics

1.8.7 Creating and operating threads, and thread safety

 

1.9: Swing Programs and Collection Classes

1.9.1 Common windows

1.9.2 Label components and icons

1.9.3 Common layout managers and panels

1.9.4 Button components and list components

1.9.5 Common event listeners

1.9.6 Overview of collection classes

1.9.7 The Set and Map collections and their interfaces

 

1.10: PC Website Layout

1.10.1 HTML basics, CSS basics, and core CSS properties

1.10.2 CSS cascading, inheritance, and the box model

1.10.3 Containers, overflow, and element types

1.10.4 Browser compatibility and adaptive width and height

1.10.5 Positioning, anchors, and opacity

1.10.6 Image merging

1.10.7 Forms, CSS properties, and filters

1.10.8 CSS optimization

 

1.11: HTML5 + CSS3 Basics

1.11.1 New HTML5 elements and attributes

1.11.2 CSS3 selectors

1.11.3 Text and font styles

1.11.4 CSS3 translation and transform handling

1.11.5 CSS3 2D and 3D transforms and animation

1.11.6 The flexible box model

1.11.7 Media queries

1.11.8 Responsive design

 

1.12: WebApp Page Layout Project

1.12.1 Mobile page design guidelines

1.12.2 Slicing designs for mobile

1.12.3 Layout with fluid text, flexible controls, and proportionally scaled images

1.12.4 Proportionally scaled layout

1.12.5 viewport/meta

1.12.6 Using rem/vw

1.12.7 flexbox in detail

1.12.8 Special style handling for the mobile web

 

1.13: Feature Development with Native JavaScript

1.13.1 What is JavaScript?

1.13.2 How JavaScript is used and how it runs

1.13.3 Basic JavaScript syntax

1.13.4 JavaScript built-in objects

1.13.5 Events and how events work

1.13.6 Building basic JavaScript effects

1.13.7 Cookie storage

1.13.8 Regular expressions

 

1.14: Ajax Asynchronous Interaction

1.14.1 Ajax overview and features

1.14.2 How Ajax works

1.14.3 The XMLHttpRequest object

1.14.4 Synchronous versus asynchronous

1.14.5 Ajax asynchronous interaction

1.14.6 Ajax cross-domain issues

1.14.7 Ajax data handling

1.14.8 Real-time interaction and push based on WebSocket

 

1.15: jQuery Applications

1.15.1 Using the various selectors and optimizing their use

1.15.2 DOM node operations

1.15.3 Event handling, encapsulation, and application

1.15.4 Using the various kinds of animation in jQuery

1.15.5 Developing usable forms

1.15.6 jQuery Ajax, functions, and caching

1.15.7 Writing, extending, and applying jQuery plugins

1.15.8 Understanding and applying modular development

 

1.16: Databases

1.16.1 The MySQL database

1.16.2 JDBC development

1.16.3 Connection pools and DBUtils

1.16.4 Introduction to Oracle

1.16.5 Introduction to the MongoDB database

1.16.6 The Apache server and the Nginx server

1.16.7 The Memcached in-memory object caching system

 

1.17: Java Web Development Core

1.17.1 XML technology

1.17.2 The HTTP protocol

1.17.3 How Servlets work

1.17.4 Session and Cookie in depth

1.17.5 Tomcat system architecture and design patterns

1.17.6 JSP syntax and built-in objects

1.17.7 JDBC technology

1.17.8 Static architecture design for high-traffic systems

 

1.18: Java Web Development Internals

1.18.1 The Web request process in depth

1.18.2 How Java I/O works

1.18.3 Chinese character encoding in Java Web

1.18.4 How javac compilation works

1.18.5 The class file structure

1.18.6 How ClassLoader works

1.18.7 JVM architecture and how it works

1.18.8 JVM memory management

 

Stage 2: The Linux System and the Hadoop Ecosystem

2.1: Linux System (1)

2.1.1 Installing the VMware Workstation virtualization software and a CentOS virtual machine

2.1.2 Understanding rack servers and deploying Linux on a real rack server

2.1.3 Common Linux commands: introduction, usage, and practice

2.1.4 Basic principles of Linux process management and the use of related tools such as ps, pkill, top, and htop

 

2.1: Linux System (2)

2.1.5 The Linux boot process, runlevels in detail, and chkconfig in detail

2.1.6 The vi and Vim editors: introduction, usage, and common shortcut keys

2.1.7 Linux user and group account management: user management and group management

2.1.8 Linux disk management, LVM logical volumes, and NFS in detail

 

2.1: Linux System (3)

2.1.9 Linux file permission management: introduction to file permissions and operations on them

2.1.10 RPM package management on Linux: introduction to RPM packages, installing, uninstalling, and other operations

2.1.11 The yum command and setting up a yum repository

2.1.12 Linux networking: introduction, configuration, and maintenance

 

2.1: Linux System (4)

2.1.13 Shell programming: introduction to the shell and writing shell scripts

2.1.14 Installing common software on Linux: the JDK, Tomcat, and MySQL, plus web project deployment

 

2.2: Hadoop Offline Computing Outline (1)

2.2.1 Introduction to the Hadoop ecosystem

2.2.2 Hadoop's position in and relationship to cloud computing

2.2.3 Hadoop use cases in China and abroad

2.2.4 Hadoop concepts, versions, and history

2.2.5 Introduction to the Hadoop core components and the HDFS and MapReduce architecture

2.2.6 Hadoop cluster structure

2.2.7 Detailed installation steps for pseudo-distributed Hadoop

 

2.2: Hadoop Offline Computing Outline (2)

2.2.8 Inspecting Hadoop from the command line and the browser

2.2.9 HDFS internals, DataNode and NameNode in detail, the shell, and the HDFS Java API

2.2.10 Introduction to the four stages of MapReduce

2.2.11 Writable

2.2.12 InputSplit and OutputSplit

2.2.13 MapTask

2.2.14 Shuffle: Sort, Partitioner, Group, Combiner

 

2.2: Hadoop Offline Computing Outline (3)

2.2.15 Reducer

2.2.16 MapReduce case studies: 1) Secondary sort

2.2.17 Inverted index

2.2.18 Optimal path

2.2.19 Telecom data mining: predicting and analyzing movement trajectories (China Prism project)

2.2.20 Social friend-recommendation algorithm

2.2.21 Algorithm for precise Internet ad targeting

 

2.2: Hadoop Offline Computing Outline (4)

2.2.22 Alibaba Tianchi big data competition: "Tmall Recommendation Algorithm"

2.2.23 MapReduce in practice: the PageRank algorithm

2.2.24 Introduction to the Hadoop 2.x cluster architecture

2.2.25 Building a Hadoop 2.x cluster

2.2.26 NameNode high availability (HA)

2.2.27 HDFS Federation

 

2.2: Hadoop Offline Computing Outline (5)

2.2.28 ResourceManager high availability (HA)

2.2.29 Common Hadoop cluster problems and solutions

2.2.30 Hadoop cluster management

 

2.3: The HBase Distributed Database (1)

2.3.1 Introduction to HBase

2.3.2 HBase compared with an RDBMS

2.3.3 Data model

2.3.4 System architecture

2.3.5 MapReduce on HBase

2.3.6 Table design

2.3.7 Walkthrough of the cluster setup process

2.3.8 Cluster monitoring

 

2.3: The HBase Distributed Database (2)

2.3.9 Cluster management

2.3.10 The HBase shell, with demonstrations

2.3.11 HBase tree-structured table design

2.3.12 HBase one-to-many and many-to-many table design

2.3.13 HBase microblog case study

2.3.14 HBase order case study

2.3.15 HBase table-level optimization

 

2.3: The HBase Distributed Database (3)

2.3.16 HBase write optimization

2.3.17 HBase read optimization

2.3.18 HBase API operations

2.3.19 Integrating HBase with MapReduce and Hive

 

2.4: The Hive Data Warehouse (1)

2.4.1 Data warehouse fundamentals

2.4.2 Definition of Hive

2.4.3 Introduction to the Hive architecture

2.4.4 Hive clusters

2.4.5 Client overview

2.4.6 Definition of HiveQL

2.4.7 HiveQL compared with SQL

2.4.8 Data types

 

2.4: The Hive Data Warehouse (2)

2.4.9 External tables and partitioned tables

2.4.10 DDL, with CLI client demos

2.4.11 DML, with CLI client demos

2.4.12 SELECT, with CLI client demos

2.4.13 Operators and functions, with CLI client demos

2.4.14 HiveServer2 and JDBC

 

2.4: The Hive Data Warehouse (3)

2.4.15 Developing and demonstrating user-defined functions (UDF and UDAF)

2.4.16 Hive optimization

2.4.17 SerDe

 

2.5: The Sqoop Data Migration Tool

2.5.1 Introduction to Sqoop and its use

2.5.2 Using the Sqoop shell

2.5.3 Sqoop-import

2.5.4 DBMS-HDFS

2.5.5 DBMS-Hive

2.5.6 DBMS-HBase

2.5.7 Sqoop-export

 

2.6: The Flume Distributed Logging Framework (1)

2.6.1 Introduction to Flume: basic knowledge

2.6.2 Installing and testing Flume

2.6.3 Flume deployment modes

2.6.4 Flume source configuration and testing

2.6.5 Flume sink configuration and testing

2.6.6 Flume selector configuration and case analysis

2.6.7 Flume Sink Processors configuration and case analysis

 

2.6: The Flume Distributed Logging Framework (2)

2.6.8 Flume Interceptors configuration and case analysis

2.6.9 Flume Avro client development

2.6.10 Integrating Flume with Kafka

 

Stage 3: Distributed Computing Frameworks: the Spark and Storm Ecosystems

3.1: The Scala Programming Language (1)

3.1.1 The Scala interpreter, variables, and common data types

3.1.2 Scala conditional expressions, input and output, and control structures such as loops

3.1.3 Scala functions, default parameters, variable-length parameters, and so on

3.1.4 Scala arrays, variable-length arrays, multidimensional arrays, and so on

3.1.5 Scala maps, tuples, and related operations

3.1.6 Scala classes, including bean properties, auxiliary constructors, and primary constructors

 

3.1: The Scala Programming Language (2)

3.1.7 Scala objects: singleton objects, companion objects, extending classes, the apply method, and so on

3.1.8 Scala packages, imports, inheritance, and related concepts

3.1.9 Scala traits

3.1.10 Scala operators

3.1.11 Scala higher-order functions

3.1.12 Scala collections

3.1.13 Scala database connections

 

3.2: Spark Big Data Processing (1)

3.2.1 Introduction to Spark

3.2.2 Spark application scenarios

3.2.3 Spark compared with Hadoop MR and Storm, and its advantages

3.2.4 RDD

3.2.5 Transformation

3.2.6 Action

3.2.7 Computing PageRank with Spark

 

3.2: Spark Big Data Processing (2)

3.2.8 Lineage

3.2.9 Introduction to the Spark model

3.2.10 Spark caching strategy and fault tolerance

3.2.11 Wide and narrow dependencies

3.2.12 Spark configuration explained

3.2.13 Building a Spark cluster

3.2.15 Solutions to common problems in cluster setup

3.2.16 Spark principles, core components, and common RDDs

 

3.2: Spark Big Data Processing (3)

3.2.17 Data locality

3.2.18 Task scheduling

3.2.19 DAGScheduler

3.2.20 TaskScheduler

3.2.21 Reading the Spark source code

3.2.22 Performance tuning

3.2.23 Integrating Spark with Hadoop 2.x: how Spark on YARN works

 

3.3: Spark Streaming: Real-Time Big Data Processing

3.3.1 Spark Streaming: data sources and DStream

3.3.2 Stateless and stateful transformations

3.3.3 Streaming window operations

3.3.4 Spark SQL programming practice

3.3.5 Working with Spark from multiple languages

3.3.6 What's new in the latest Spark version

 

3.4: Spark MLlib Machine Learning (1)

3.4.1 Introduction to MLlib

3.4.2 Introduction to the Spark MLlib components

3.4.3 Basic data types

3.4.4 Regression algorithms

3.4.5 Generalized linear models

3.4.6 Logistic regression

3.4.7 Classification algorithms

3.4.8 Naive Bayes

 

3.4: Spark MLlib Machine Learning (2)

3.4.9 Decision trees

3.4.10 Random forests

3.4.11 Recommender systems

3.4.12 Clustering

a) K-means

b) Sparse k-means

c) K-means++

d) K-means II

e) Streaming k-means

f) Gaussian mixture model

 

3.5: Spark GraphX Graph Computing

3.5.1 Bipartite graphs

3.5.2 Overview

3.5.3 Constructing graphs

3.5.4 Property graphs

3.5.5 PageRank

 

3.6: Storm Technical Architecture (1)

3.6.1 Project technical architecture

3.6.2 What is Storm?

3.6.3 Storm architecture analysis

3.6.4 The Storm programming model, Tuple source code, and concurrency analysis

3.2.5 Transformation

 

3.6: Storm Technical Architecture (2)

3.6.6 Quickly setting up the environment with Maven

3.6.7 The Storm WordCount example and commonly used APIs

3.6.8 Business metric computation with Storm + Kafka + Redis

3.6.9 Installing and deploying a Storm cluster

3.6.10 Downloading and compiling the Storm source code

 

3.7: Storm Principles and Fundamentals (1)

3.7.1 Storm cluster startup and source code analysis

3.7.2 Storm task submission and source code analysis

3.7.3 Analysis of the Storm data-sending process

3.7.4 Analysis of the Storm communication mechanism

3.7.5 Storm message fault-tolerance mechanism and source code analysis

3.7.6 Analysis of a multi-stream Storm project

3.7.7 Storm Trident and sensor data

 

3.7: Storm Principles and Fundamentals (2)

3.7.8 Real-time trend analysis

3.7.9 Introduction to Storm DRPC (distributed remote procedure call)

3.7.10 Storm DRPC in practice

3.7.11 Writing your own stream task execution framework

 

3.8: The Kafka Message Queue

3.8.1 What is a message queue?

3.8.2 Kafka core components

3.8.3 Kafka cluster deployment and common commands

3.8.4 Walkthrough of the Kafka configuration files

3.8.5 Learning the Kafka Java API

3.8.6 Analysis of the Kafka file storage mechanism

3.8.7 Kafka publish and subscribe

3.8.8 Kafka coordination and management with ZooKeeper

 

3.9: The Redis Tool

3.9.1 Introduction to NoSQL

3.9.2 Introduction to Redis

3.9.3 Installing Redis

3.9.4 Client connections

3.9.5 Redis data functions

3.9.6 Redis persistence

3.9.7 Redis use cases

 

3.10: ZooKeeper in Detail

3.10.1 Introduction to ZooKeeper

3.10.2 ZooKeeper cluster deployment

3.10.3 ZooKeeper's core working mechanism

3.10.4 ZooKeeper command-line operations

3.10.5 The ZooKeeper client API

3.10.6 ZooKeeper use cases

3.10.7 Additional notes on ZooKeeper principles

 

Stage 4: Big Data Projects

4.1: Alibaba's Taobao e-commerce big data traffic analysis platform (1)

4.1.1 Project introduction (1)

This project teaches log analysis and order management for the Taobao website through hands-on practice, and it covers many technical points. Each visitor (UV) who clicks through to the store counts as one unit of traffic, and page views (PV) count the pages a visitor (UV) views in the store. One UV produces at least one PV, and PV/UV is called the visit depth.

 

4.1: Alibaba's Taobao e-commerce big data traffic analysis platform (2)

4.1.1 Project introduction (2)

Each visitor (UV) who clicks through to the store counts as one unit of traffic. Page views (PV) count the pages a visitor (UV) views in the store. One UV produces at least one PV, and PV/UV is commonly known as the visit depth.
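
The PV, UV, and visit depth definitions above can be illustrated with a minimal Python sketch. The record layout (one `(visitor_id, page)` pair per page view), the function name `traffic_stats`, and the sample data are assumptions made for illustration; they are not taken from the project itself.

```python
from collections import defaultdict


def traffic_stats(visits):
    """Compute PV, UV, and visit depth from (visitor_id, page) records."""
    pages_per_visitor = defaultdict(int)
    for visitor_id, _page in visits:
        pages_per_visitor[visitor_id] += 1  # every record is one page view

    pv = sum(pages_per_visitor.values())  # total page views
    uv = len(pages_per_visitor)           # distinct visitors
    depth = pv / uv if uv else 0.0        # visit depth = PV / UV
    return pv, uv, depth


# Hypothetical sample: visitor "u1" views three pages, "u2" views one.
sample = [("u1", "/home"), ("u1", "/item/1"), ("u1", "/cart"), ("u2", "/home")]
pv, uv, depth = traffic_stats(sample)
print(pv, uv, depth)
```

Because every visitor in the input has at least one record, the sketch also reflects the rule that one UV produces at least one PV.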

 

4.1: Alibaba's Taobao e-commerce big data traffic analysis platform (3)

4.1.1 Project introduction (3)

The factor that influences natural ranking in organic search is called weight. Weight is the decisive factor in whether a product ranks near the top and attracts more traffic. Weight takes dozens of forms; common ones include sales volume, positive reviews, favorites, DSR, maintenance time, and delisting time.

 

4.1: Alibaba's Taobao e-commerce big data traffic analysis platform (4)

4.1.2 Project features

How to actually apply these technical points is something self-study does not teach. Cookie log analysis covers PV, UV, bounce rate, second-jump rate, ad conversion rate, search engine optimization, and more. The order module covers product recommendation, merchant ranking, historical order queries, order report statistics, and more.

 

4.1: Alibaba's Taobao e-commerce big data traffic analysis platform (5)

4.1.3 Project architecture

SDK (Java SDK, JS SDK) +

LVS + Nginx cluster + Flume +

HDFS 2.x + Hive + HBase + MR + MySQL

 

4.1: Alibaba's Taobao e-commerce big data traffic analysis platform (6)

4.1.4 Project process (1)

a) Data collection: integrating the web project with the cloud computing project

b) Data processing: Flume collects the web project's logs in real time via Avro

c) Data ETL

d) Data presentation and storage: executing batch SQL with Hive

e) Hive user-defined functions

 

4.1: Alibaba's Taobao e-commerce big data traffic analysis platform (7)

4.1.4 Project process (2)

f) Integrating Hive with HBase

g) SQL query analysis over HBase data

h) Data analysis: data mining with MapReduce

i) HBase DAO handling

j) Using Sqoop in the project

k) Data visualization: scheduled MapReduce invocation and monitoring

 

4.2: Hands-On Project 1: A Spark-Based Recommender System for Sina Weibo (1)

4.2.1 Project introduction (1)

Personalized recommendation suggests information and products that a user is likely to be interested in, based on the user's interests and purchasing behavior. As e-commerce keeps growing in scale, the number and variety of products increase rapidly, and customers have to spend a great deal of time finding what they want to buy. Forced to wade through large amounts of irrelevant information and products, and swamped by information overload,

 

4.2: Hands-On Project 1: A Spark-Based Recommender System for Sina Weibo (2)

4.2.1 Project introduction (2)

consumers keep drifting away. Personalized recommender systems emerged to solve these problems. A personalized recommender system is an advanced business intelligence platform built on large-scale data mining that helps e-commerce websites provide fully personalized decision support and information services for their customers' shopping.

 

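4.2: Hands-On Project 1: A Spark-Based Recommender System for Sina Weibo (3)

4.2.2 Project features (1)

A recommender system is a complex piece of systems engineering that depends on the close integration of engineering, architecture, and algorithms, and it distills data mining, information retrieval, and computational statistics. Only by building one hands-on can students appreciate each part of a recommender system and get a real feel for the strengths and weaknesses of the various recommendation algorithms. On the one hand, students should be able to fluently implement simple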

 

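4.2: Hands-On Project 1: A Spark-Based Recommender System for Sina Weibo (4)

4.2.2 Project features (2)

recommendation algorithms such as content-based and item-based CF. On the other hand, they should master some common recommendation algorithm libraries, such as SVDFeature, LibFM, Mahout, and MLlib.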

 

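4.2: Hands-On Project 1: A Spark-Based Recommender System for Sina Weibo (5)

4.2.3 Project technical architecture (1)

a) Real-time stream processing: Kafka, Spark Streaming

b) Distributed computing: Hadoop, Spark

c) Databases: HBase, Redis

d) Machine learning: Spark MLlib

e) Front-end web data display: Struts2, ECharts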

 

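4.2: Hands-On Project 1: A Spark-Based Recommender System for Sina Weibo (6)

4.2.3 Project technical architecture (2)

f) Distributed platform: Hadoop, Spark

g) Data cleaning: Hive

h) Data analysis: R, RStudio

i) Recommendation service: Dubbox

j) Rule filtering: Drools

k) Machine learning: MLlib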

 

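4.3: Hands-On Project 2: A DSP Ad Delivery System for the Sina Portal (1)

4.3.1 Project introduction

Sina (www.sina.com.cn) is a well-known portal website. This project collects the logs generated for each Sina cookie and analyzes them to produce statistics on the site's traffic and its bidding ad slots.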

 

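4.3: Hands-On Project 2: A DSP Ad Delivery System for the Sina Portal (2)

4.3.2 Project features

Three money-makers have always been talked about in the Internet world: advertising, games, and e-commerce. With the rise of the mobile Internet and its unique data advantages, it is finally possible to answer the questions that have troubled advertisers for centuries: who actually saw my ad, and where did the wasted half of the money go?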

 

 

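4.3: Hands-On Project 2: A DSP Ad Delivery System for the Sina Portal (3)

4.3.3 Project technical architecture (1)

a) Import log data into HDFS with Flume and clean it with Hive

b) Provide a web view where users enter query task parameters, which are written to MySQL

c) Use Spark to run session analysis and single-jump rate analysis based on the task parameters users submit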

 

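4.3: Hands-On Project 2: A DSP Ad Delivery System for the Sina Portal (4)

4.3.3 Project technical architecture (2)

d) Use Spark SQL to compute statistics on popular ads of each type

e) Use Flume to send ad click logs to Kafka, and use Spark Streaming to compute ad click-through rate statistics

f) Display the task results stored in MySQL on a web page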

 

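4.4: Hands-On Project 3: A Business Log Alerting System (1)

4.4.1 Project introduction (1)

The system monitors logs according to a set of rules and raises alerts, by SMS and email, for log entries that trigger a rule. As the company's business grows, more and more systems support it. To keep the business running normally, it is urgent to monitor how these online systems are running,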

 

 

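4.4: Hands-On Project 3: A Business Log Alerting System (2)

4.4.1 Project introduction (2)

so that problems are discovered and handled promptly and the impact on the business is minimized.

4.4.2 Project features (1)

The overall architecture is well designed. The main architecture is as follows:

a) The application generates logs with log4j

b) A Flume client is deployed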

 

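4.4: Hands-On Project 3: A Business Log Alerting System (3)

4.4.2 Project features (2)

to monitor the log messages the application produces and send them to the Kafka cluster

c) A Storm spout pulls the data from Kafka and consumes it, checking each log entry against the rules and sending an email alert for entries that match.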

 

4.4: Hands-On Project 3: A Business Log Alerting System (4)

4.4.2 Project features (3)

d) Finally, the alert information is saved to a MySQL database for management.
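
The rule check in step c) can be sketched in plain Python. This is only an illustration of matching each log entry against a set of rules: it does not use Storm, Kafka, or MySQL, and the `Rule` class, the regular-expression rule format, and the sample log lines are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    pattern: str  # regular expression that triggers the rule


def find_alerts(log_lines, rules):
    """Check each log line against every rule; return (rule name, line) pairs."""
    compiled = [(rule.name, re.compile(rule.pattern)) for rule in rules]
    alerts = []
    for line in log_lines:
        for name, regex in compiled:
            if regex.search(line):
                alerts.append((name, line))
    return alerts


# Hypothetical rules and log lines, for illustration only.
rules = [Rule("error", r"\bERROR\b"), Rule("timeout", r"timed out")]
logs = [
    "2020-01-01 10:00:00 INFO order created",
    "2020-01-01 10:00:01 ERROR payment failed",
    "2020-01-01 10:00:02 WARN request timed out",
]
alerts = find_alerts(logs, rules)
for name, line in alerts:
    print(f"[{name}] {line}")
```

In the pipeline described above, the returned pairs would be the entries that trigger an email alert and are then stored for management.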

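4.4.3 Project technical architecture

a) Recommender system fundamentals

b) Analysis of the recommender system development process

c) Using the Mahout collaborative filtering API

d) Building a recommendation engine in Java

e) Integrating and running the recommender system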

 

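4.5: Hands-On Project 4: An Internet "Guess You Like" Recommender System (1)

4.5.1 Project introduction (1)

People who shop online have grown used to receiving personalized recommendations from the system. Netflix recommends videos you might like. TiVo automatically records programs so that you can watch them if you are interested. Pandora generates a personalized music stream by predicting what songs we want to hear. All of these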

 

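4.5: Hands-On Project 4: An Internet "Guess You Like" Recommender System (2)

4.5.1 Project introduction (2)

recommendations come from recommender systems of all kinds. They run on computer algorithms and, based on customers' browsing, searches, orders, and preferences, select products that customers may like and may buy, thereby serving consumers. Recommender systems were originally designed to help online retailers increase sales, and this is now a huge and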

 

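4.5: Hands-On Project 4: An Internet "Guess You Like" Recommender System (3)

4.5.1 Project introduction (3)

still-growing business. Meanwhile, recommender system development has grown from a field studied by only a few dozen people in the mid-1990s to one with hundreds of researchers today, working at universities, large online retailers, and dozens of other companies that specialize in such systems.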

 

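4.5: Hands-On Project 4: An Internet "Guess You Like" Recommender System (4)

4.5.2 Project features (1)

Have you ever wondered what you look like to Amazon? The answer: you are a very long string of numbers in a very, very large table. That string describes everything you have looked at, every link you have clicked, and every item you have bought on Amazon's website; the rest of the table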

 

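4.5: Hands-On Project 4: An Internet "Guess You Like" Recommender System (5)

4.5.2 Project features (2)

represents the millions of other people who shop at Amazon. Your numbers change every time you log in to the site, and in the meantime every move you make on the site changes them as well. This information in turn affects what you see on each page you visit, and which emails and offers you receive from Amazon.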
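
The "row of numbers in a large table" picture can be made concrete with a small Python sketch. It assumes a toy table with one row per shopper and one column per product, and it compares rows with cosine similarity, a common choice that is named here as an assumption rather than as the method any particular retailer uses. The shopper names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric rows."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty row is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(user, table):
    """Return the other shopper whose row is closest to `user`'s row."""
    others = [name for name in table if name != user]
    return max(others, key=lambda name: cosine_similarity(table[user], table[name]))


# Hypothetical table: one row per shopper, one column per product,
# 1 meaning the shopper viewed, clicked, or bought that product.
table = {
    "alice": [1, 1, 0, 0],
    "bob":   [1, 1, 1, 0],
    "carol": [0, 0, 0, 1],
}
print(most_similar("alice", table))
```

A recommender built on this idea could suggest to one shopper the products that appear only in the most similar shopper's row.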

 

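4.5: Hands-On Project 4: An Internet "Guess You Like" Recommender System (6)

4.5.3 Project technical architecture

a) Recommender system fundamentals

b) Analysis of the recommender system development process

c) Using the Mahout collaborative filtering API

d) Building a recommendation engine in Java

e) Integrating and running the recommender system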

 

 

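Stage 5: Big Data Analytics Track: AI (Artificial Intelligence)

5.1: Python Programming, Data Analysis Environment Setup, and Data Analysis Fundamentals (1)

5.1.1 Introduction to Python and its features

5.1.2 Installing Python

5.1.3 Basic Python operations (comments, logic, string usage, and so on)

5.1.4 Python data structures (tuples, lists, dictionaries)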

 

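5.1: Python Programming, Data Analysis Environment Setup, and Data Analysis Fundamentals (2)

5.1.5 A small example: batch renaming with Python

5.1.6 Common Python built-in functions

5.1.7 More Python functions and common usage tips

5.1.8 Exceptions

5.1.9 Python function parameters explained

5.1.10 Importing Python modules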

 

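5.1: Python Programming, Data Analysis Environment Setup, and Data Analysis Fundamentals (3)

5.1.11 Classes and inheritance in Python

5.1.12 Web crawler case study

5.1.13 Database connections and installing modules with pip

5.1.14 MongoDB basics

5.1.15 How to connect to MongoDB

5.1.16 A Python machine learning case study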

 

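5.1: Python Programming, Data Analysis Environment Setup, and Data Analysis Fundamentals (4)

5.1.17 Overview of AI, machine learning, and deep learning

5.1.18 Preparing the working environment

5.1.19 Python techniques commonly used in data analysis

5.1.20 Advanced pandas and tips

5.1.21 Statistical analysis of data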

 

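5.2: Data Visualization

5.2.1 Concepts of data visualization

5.2.2 Drawing and visualizing charts

5.2.3 Animation and interactive rendering

5.2.4 Merging and grouping data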

 

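5.3: Python Machine Learning 1 (1)

5.3.1 Basic concepts of machine learning

5.3.2 The ML workflow

5.3.3 The Python machine learning library scikit-learn

5.3.4 The KNN model

5.3.5 The linear regression model

5.3.6 The logistic regression model

5.3.7 The support vector machine model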

 

5.3: Python Machine Learning 1 (2)

5.3.8 The decision tree model

5.3.9 Hyperparameters and learned parameters

 

 

5.4: Python Machine Learning 2

5.4.1 Model evaluation metrics

5.4.2 Cross-validation

5.4.3 Classic machine learning algorithms

5.4.4 Naive Bayes

5.4.5 Random forests

5.4.6 GBDT

 

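5.5: Image Recognition and Neural Networks

5.5.1 The image-processing workflow

5.5.2 Feature engineering

5.5.3 Image feature description

5.5.4 Description of AI networks

5.5.5 Deep learning

5.5.6 Learning the TensorFlow framework

5.5.7 Convolutional neural networks (CNN) with the TensorFlow framework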

 

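5.6: Natural Language Processing and Social Network Processing

5.6.1 Text data processing with Python

5.6.2 Natural language processing and NLTK

5.6.3 Topic models

5.6.4 LDA

5.6.5 Introduction to graph theory

5.6.6 Network operations and data visualization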

 

5.7: Hands-On Projects