Course catalog: Big Data Business Intelligence for Govt. Agencies Training
Course outline:

Each session is 2 hours
Day-1: Session -1: Business Overview of Why Big Data Business Intelligence in Govt.
Case Studies from NIH, DoE
Big Data adoption rate in Govt. Agencies and how they are aligning their future operations around Big Data Predictive Analytics
Broad Scale Application Area in DoD, NSA, IRS, USDA etc.
Interfacing Big Data with Legacy data
Basic understanding of enabling technologies in predictive analytics
Data Integration & Dashboard visualization
Fraud management
Business Rule/ Fraud detection generation
Threat detection and profiling
Cost benefit analysis for Big Data implementation
Day-1: Session-2 : Introduction of Big Data-1
Main characteristics of Big Data: volume, variety, velocity and veracity. MPP architecture for volume.
Data Warehouses – static schema, slowly evolving dataset
MPP Databases like Greenplum, Exadata, Teradata, Netezza, Vertica etc.
Hadoop Based Solutions – no conditions on structure of dataset.
Typical pattern : HDFS, MapReduce (crunch), retrieve from HDFS
Batch- suited for analytical/non-interactive
Velocity : CEP streaming data
Typical choices – CEP products (e.g. Infostreams, Apama, MarkLogic etc)
Less production ready – Storm/S4
NoSQL Databases – (columnar and key-value): Best suited as analytical adjunct to data warehouse/database
Day-1 : Session -3 : Introduction to Big Data-2
NoSQL solutions
KV Store - Keyspace, Flare, SchemaFree, RAMCloud, Oracle NoSQL Database (OnDB)
KV Store - Dynamo, Voldemort, Dynomite, SubRecord, Mo8onDb, DovetailDB
KV Store (Hierarchical) - GT.M, Caché
KV Store (Ordered) - TokyoTyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord
KV Cache - Memcached, Repcached, Coherence, Infinispan, eXtreme Scale, JBossCache, Velocity, Terracotta
Tuple Store - Gigaspaces, Coord, Apache River
Object Database - ZopeDB, db4o, Shoal
Document Store - CouchDB, Cloudant, Couchbase, MongoDB, Jackrabbit, XML-Databases, ThruDB, CloudKit, Persevere, Riak-Basho, Scalaris
Wide Columnar Store - BigTable, HBase, Apache Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
Varieties of Data: Introduction to Data Cleaning issue in Big Data
RDBMS – static structure/schema, doesn’t promote agile, exploratory environment.
NoSQL – semi structured, enough structure to store data without exact schema before storing data
Data cleaning issues
Day-1 : Session-4 : Big Data Introduction-3 : Hadoop
When to select Hadoop?
STRUCTURED - Enterprise data warehouses/databases can store massive data (at a cost) but impose structure (not good for active exploration)
SEMI STRUCTURED data – tough to do with traditional solutions (DW/DB)
Warehousing data = HUGE effort and static even after implementation
For variety & volume of data, crunched on commodity hardware – HADOOP
Commodity H/W needed to create a Hadoop Cluster
Introduction to Map Reduce /HDFS
MapReduce – distribute computing over multiple servers
HDFS – make data available locally for the computing process (with redundancy)
Data – can be unstructured/schema-less (unlike RDBMS)
Developer responsibility to make sense of data
Programming MapReduce = working with Java (pros/cons), manually loading data into HDFS
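The map, shuffle and reduce steps introduced in this session can be sketched in a few lines. The example below is a single-process illustration only, written in Python rather than the Java the session refers to; it does not use the Hadoop API, and the function names and sample lines are made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input; in a real cluster these lines would be blocks read from HDFS.
sample_lines = ["big data in govt", "big data on hadoop"]
counts = word_count(sample_lines)
print(counts)
```

In a real cluster the mappers and reducers run on different servers and the shuffle moves data across the network; here all three steps run in one process to keep the idea visible.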
Day-2: Session-1: Big Data Ecosystem-Building Big Data ETL: universe of Big Data Tools-which one to use and when?
Hadoop vs. Other NoSQL solutions
For interactive, random access to data
HBase (column oriented database) on top of Hadoop
Random access to data but restrictions imposed (max 1 PB)
Not good for ad-hoc analytics, good for logging, counting, time-series
Sqoop - Import from databases to Hive or HDFS (JDBC/ODBC access)
Flume – Stream data (e.g. log data) into HDFS
Day-2: Session-2: Big Data Management System
Moving parts, compute nodes start/fail :ZooKeeper - For configuration/coordination/naming services
Complex pipeline/workflow: Oozie – manage workflow, dependencies, daisy chain
Deploy, configure, cluster management, upgrade etc (sys admin) :Ambari
In Cloud : Whirr
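The workflow and dependency management item in this session comes down to running each step only after the steps it depends on. A minimal sketch of that idea, using Python's standard `graphlib` module (Python 3.9+) rather than Oozie itself; the pipeline step names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key maps to the set of steps it depends on.
pipeline = {
    "ingest_logs": set(),
    "import_tables": set(),
    "clean": {"ingest_logs", "import_tables"},
    "aggregate": {"clean"},
    "publish_dashboard": {"aggregate"},
}


def execution_order(dependencies):
    """Return one valid order in which every step runs after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(pipeline)
print(order)
```

A workflow manager adds scheduling, retries and failure handling on top of this ordering; the sketch covers only the dependency resolution.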
Day-2: Session-3: Predictive analytics in Business Intelligence -1: Fundamental Techniques & Machine learning based BI :
Introduction to Machine learning
Learning classification techniques
Bayesian Prediction-preparing training file
Support Vector Machine
KNN p-Tree Algebra & vertical mining
Neural Network
Big Data large variable problem -Random forest (RF)
Big Data Automation problem – Multi-model ensemble RF
Automation through Soft10-M
Text analytic tool-Treeminer
Agile learning
Agent based learning
Distributed learning
Introduction to Open source Tools for predictive analytics : R, RapidMiner, Mahout
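As a concrete instance of the classification techniques listed in this session, the sketch below implements plain k-nearest-neighbours (KNN) with a majority vote. It is the textbook algorithm, not the p-tree or vertical-mining variant named above, and the training points and labels are made-up values for illustration.

```python
from collections import Counter
import math


def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up features: (claim amount in $1000s, claims filed per year) -> label
training_data = [
    ((1.0, 1.0), "normal"),
    ((1.5, 2.0), "normal"),
    ((2.0, 1.0), "normal"),
    ((9.0, 8.0), "suspicious"),
    ((8.5, 9.5), "suspicious"),
    ((10.0, 9.0), "suspicious"),
]

prediction = knn_predict(training_data, (8.0, 8.0))
print(prediction)
```

Every prediction scans the full training set, which is one reason the plain algorithm becomes expensive on large data.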
Day-2: Session-4 Predictive analytics eco-system-2: Common predictive analytic problems in Govt.
Insight analytic
Visualization analytic
Structured predictive analytic
Unstructured predictive analytic
Threat/fraudster/vendor profiling
Recommendation Engine
Pattern detection
Rule/Scenario discovery –failure, fraud, optimization
Root cause discovery
Sentiment analysis
CRM analytic
Network analytic
Text Analytics
Technology assisted review
Fraud analytic
Real Time Analytic
Day-3 : Session-1 : Real Time and Scalable Analytic Over Hadoop
Why common analytic algorithms fail in Hadoop/HDFS
Apache Hama- for Bulk Synchronous distributed computing
Apache Spark- for cluster computing for real time analytic
CMU GraphLab 2- Graph based asynchronous approach to distributed computing
KNN p-Algebra based approach from Treeminer for reduced hardware cost of operation
Day-3: Session-2: Tools for eDiscovery and Forensics
eDiscovery over Big Data vs. Legacy data – a comparison of cost and performance
Predictive coding and technology assisted review (TAR)
Live demo of a TAR product (vMiner) to understand how TAR works for faster discovery
Faster indexing through HDFS –velocity of data
NLP or Natural Language processing –various techniques and open source products
eDiscovery in foreign languages-technology for foreign language processing
Day-3 : Session 3: Big Data BI for Cyber Security – Understanding the whole 360-degree view, from speedy data collection to threat identification
Understanding basics of security analytics-attack surface, security misconfiguration, host defenses
Network infrastructure/ Large datapipe / Response ETL for real time analytic
Prescriptive vs predictive – Fixed rule based vs auto-discovery of threat rules from Meta data
Day-3: Session 4: Big Data in USDA : Application in Agriculture
Introduction to IoT (Internet of Things) for agriculture: sensor based Big Data and control
Introduction to Satellite imaging and its application in agriculture
Integrating sensor and image data for fertility of soil, cultivation recommendation and forecasting
Agriculture insurance and Big Data
Crop Loss forecasting
Day-4 : Session-1: Fraud prevention BI from Big Data in Govt-Fraud analytic:
Basic classification of Fraud analytics- rule based vs predictive analytics
Supervised vs unsupervised Machine learning for Fraud pattern detection
Vendor fraud/over charging for projects
Medicare and Medicaid fraud- fraud detection techniques for claim processing
Travel reimbursement frauds
IRS refund frauds
Case studies and live demo will be given wherever data is available.
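The rule-based versus predictive distinction in this session can be illustrated on a toy set of reimbursement claims. The sketch below contrasts a fixed threshold rule with a simple unsupervised outlier test (the modified z-score built on the median and the median absolute deviation). The claim records, field names and thresholds are assumptions chosen for illustration, not course material.

```python
from statistics import median


def rule_based_flags(claims, limit):
    """Fixed rule: flag any claim whose amount exceeds a hand-chosen limit."""
    return [c["id"] for c in claims if c["amount"] > limit]


def outlier_flags(claims, threshold=3.5):
    """Unsupervised: flag claims whose modified z-score exceeds `threshold`."""
    amounts = [c["amount"] for c in claims]
    med = median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # no spread to measure against
    return [
        c["id"]
        for c in claims
        if 0.6745 * abs(c["amount"] - med) / mad > threshold
    ]


# Made-up travel reimbursement claims (amounts in dollars).
claims = [
    {"id": "T1", "amount": 120},
    {"id": "T2", "amount": 135},
    {"id": "T3", "amount": 110},
    {"id": "T4", "amount": 140},
    {"id": "T5", "amount": 125},
    {"id": "T6", "amount": 130},
    {"id": "T7", "amount": 980},
]

print(rule_based_flags(claims, limit=1000))  # rule set too high misses T7
print(outlier_flags(claims))                 # data-driven test still flags T7
```

The fixed rule depends on someone picking the right limit in advance, while the outlier test derives its cut-off from the data itself. Neither approach learns from labelled fraud cases, which is what the supervised methods in this session add.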
Day-4 : Session-2: Social Media Analytic- Intelligence gathering and analysis
Big Data ETL API for extracting social media data
Text, image, meta data and video
Sentiment analysis from social media feed
Contextual and non-contextual filtering of social media feed
Social Media Dashboard to integrate diverse social media
Automated profiling of social media profile
Live demo of each analytic will be given through Treeminer Tool.
Day-4 : Session-3: Big Data Analytic in image processing and video feeds
Image Storage techniques in Big Data- Storage solution for data exceeding petabytes
LTFS and LTO
GPFS-LTFS ( Layered storage solution for Big image data)
Fundamental of image analytics
Object recognition
Image segmentation
Motion tracking
3-D image reconstruction
Day-4: Session-4: Big Data applications in NIH:
Emerging areas of Bio-informatics
Meta-genomics and Big Data mining issues
Big Data Predictive analytic for Pharmacogenomics, Metabolomics and Proteomics
Big Data in downstream Genomics process
Application of Big data predictive analytics in Public health
Big Data Dashboard for quick accessibility of diverse data and display :
Integration of existing application platform with Big Data Dashboard
Big Data management
Case Study of Big Data Dashboard: Tableau and Pentaho
Use Big Data app to push location based services in Govt.
Tracking system and management
Day-5 : Session-1: How to justify Big Data BI implementation within an organization:
Defining ROI for Big Data implementation
Case studies for saving Analyst Time for collection and preparation of Data –increase in productivity gain
Case studies of revenue gain from saving the licensed database cost
Revenue gain from location based services
Saving from fraud prevention
An integrated spreadsheet approach to calculate approx. expense vs. Revenue gain/savings from Big Data implementation.
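The spreadsheet-style calculation of expense versus gain can be sketched as a small function. The cost and gain categories follow the items listed in this session, but every figure below is a placeholder, not data from any case study.

```python
def big_data_roi(costs, gains):
    """Summarise total cost, total gain, net benefit and ROI as a percentage of cost."""
    total_cost = sum(costs.values())
    total_gain = sum(gains.values())
    net_benefit = total_gain - total_cost
    return {
        "total_cost": total_cost,
        "total_gain": total_gain,
        "net_benefit": net_benefit,
        "roi_percent": 100 * net_benefit / total_cost,
    }


# Placeholder annual figures in dollars (assumptions for illustration only).
costs = {"hardware": 200_000, "staff": 300_000, "training": 50_000}
gains = {
    "analyst_time_saved": 250_000,
    "license_savings": 300_000,
    "fraud_prevented": 220_000,
}

summary = big_data_roi(costs, gains)
print(summary)
```

With these placeholder numbers the net benefit is 220,000 on a cost of 550,000, an ROI of 40%.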
Day-5 : Session-2: Step by Step procedure to replace legacy data system to Big Data System:
Understanding practical Big Data Migration Roadmap
What important information is needed before architecting a Big Data implementation
What are the different ways of calculating volume, velocity, variety and veracity of data
How to estimate data growth
Case studies
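One common way to estimate data growth is to assume a constant annual growth rate, so that the volume after n years is the current volume multiplied by (1 + rate)^n. The sketch below applies that compound-growth assumption; the starting volume, rate and capacity are hypothetical, and real growth is rarely this regular.

```python
import math


def projected_volume(current_tb, annual_growth_rate, years):
    """Projected data volume in TB after `years` of compound growth."""
    return current_tb * (1 + annual_growth_rate) ** years


def years_until(current_tb, annual_growth_rate, capacity_tb):
    """Whole years until the projected volume first reaches `capacity_tb`."""
    if capacity_tb <= current_tb:
        return 0
    ratio = capacity_tb / current_tb
    return math.ceil(math.log(ratio) / math.log(1 + annual_growth_rate))


# Hypothetical inputs: 100 TB today, growing 40% per year.
print(projected_volume(100, 0.40, 3))
print(years_until(100, 0.40, 1000))
```

Under these assumptions, 100 TB grows to roughly 274 TB in three years and passes 1 PB in the seventh year, which is the kind of figure a migration roadmap needs for sizing a cluster.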
Day-5: Session 4: Review of Big Data Vendors and review of their products. Q/A session:
Accenture
APTEAN (Formerly CDC Software)
Cisco Systems
Cloudera
Dell
EMC
GoodData Corporation
Guavus
Hitachi Data Systems
Hortonworks
HP
IBM
Informatica
Intel
Jaspersoft
Microsoft
MongoDB (Formerly 10Gen)
Mu Sigma
NetApp
Opera Solutions
Oracle
Pentaho
Platfora
QlikTech
Quantum
Rackspace
Revolution Analytics
Salesforce
SAP
SAS Institute
Sisense
Software AG/Terracotta
Soft10 Automation
Splunk
Sqrrl
Supermicro
Tableau Software
Teradata
Think Big Analytics
Tidemark Systems
Treeminer
VMware (Part of EMC)
