How should an enterprise-level data warehouse be built? Why build an enterprise data warehouse? In most enterprises, data lives in two places: the business databases and the logs. A business database usually has limited capacity, so records marked as deleted or historical are typically purged on a regular schedule, even though that data is often still valuable. Its computing power is also limited, and running data analysis directly on it consumes resources that the business itself needs.
Some analyses span different departments and business lines, often need several databases, and may even need to be correlated with the logs. This is why a new department appears: the data warehouse or data analysis department. Its first task is to collect the data from the different lines of business into one central place. In the past, the usual choice of data processing technology was a commercial data warehouse. Since the advent of Hadoop, more and more companies have adopted it for its ease of use, high scalability, and low cost. This article briefly introduces how to build a data warehouse with E-MapReduce, Alibaba Cloud's open source big data ecosystem.
Building a data warehouse
The approximate architecture is shown below:
· For data in the cloud database RDS MySQL, you can synchronize the full data set to offline storage every night using E-MapReduce Sqoop, creating one partition per date. A query can then target a single day's partition, for example:
select count(*) from cluster where ds='2016-08-28'
· Log data can be synchronized to the cloud storage service OSS using Log Service, or to E-MapReduce HDFS using Flume. It is also partitioned by date.
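The nightly full synchronization into a date partition could be sketched as follows. This is only an illustration under assumptions: the JDBC URL, user name, password file path, and the choice of a single mapper are hypothetical placeholders, and the article does not show the actual Sqoop invocation it uses. The function just assembles a `sqoop import` argument list that loads one table into the Hive partition `ds=<date>`.

```python
from datetime import date
from typing import List


def build_sqoop_import(jdbc_url: str, username: str, password_file: str,
                       table: str, ds: date) -> List[str]:
    """Assemble a sqoop import command that loads a full table into one Hive partition."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,               # hypothetical RDS MySQL JDBC URL
        "--username", username,              # hypothetical read-only account
        "--password-file", password_file,    # hypothetical path to the password file
        "--table", table,
        "--hive-import",                     # load the result into Hive
        "--hive-overwrite",                  # replace the partition if the job is rerun
        "--hive-table", table,
        "--hive-partition-key", "ds",        # partition column used in the article's query
        "--hive-partition-value", ds.isoformat(),
        "--num-mappers", "1",                # assumption: a single mapper, no split column needed
    ]


if __name__ == "__main__":
    cmd = build_sqoop_import(
        "jdbc:mysql://example-host:3306/demo", "reader",
        "/user/hadoop/.rds_password", "cluster", date(2016, 8, 28),
    )
    print(" ".join(cmd))
```

Returning the command as a list rather than a single string makes it easy to pass to a process launcher without shell quoting problems.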
After the logs are collected, you can analyze them with the Hive or Spark engine. For a report, for example, you can write the computed results to E-MapReduce HBase or to the cloud database RDS MySQL, and then deliver the report with Quick BI, provided by Alibaba Cloud. Every morning you can then see how the business did on the previous day.
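Because both the database data and the logs are partitioned by date, a morning report only needs the previous day's partition. A minimal sketch of that step, assuming the `ds` partition column and the `cluster` table from the query shown earlier; the function names are hypothetical and the resulting HiveQL would still have to be submitted to Hive or Spark by the job itself:

```python
from datetime import date, timedelta


def previous_day_partition(today: date) -> str:
    """Return yesterday's date formatted as a ds partition value (YYYY-MM-DD)."""
    return (today - timedelta(days=1)).isoformat()


def daily_count_query(table: str, ds: str) -> str:
    """Build a HiveQL count restricted to a single date partition."""
    return f"SELECT count(*) FROM {table} WHERE ds='{ds}'"


if __name__ == "__main__":
    ds = previous_day_partition(date.today())
    print(daily_count_query("cluster", ds))
```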
Job execution
Synchronization and analysis jobs can be run using the execution plans provided by Alibaba Cloud E-MapReduce. You can create a new execution plan, chain multiple jobs together, and start the analysis job once the synchronization job has completed. Execution plans also provide useful functions such as job failure alarms and startup timeout alarms.
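The chaining behaviour can be illustrated with a small stand-alone sketch. It does not use the E-MapReduce execution plan API; `run_plan`, the job names, and the alarm callback are hypothetical and only model the idea that a later job starts after the earlier ones succeed and that a failure raises an alarm.

```python
from typing import Callable, List, Tuple

# A job is a name plus a function that returns True on success.
Job = Tuple[str, Callable[[], bool]]


def run_plan(jobs: List[Job], on_failure: Callable[[str], None]) -> List[str]:
    """Run jobs in order, stop at the first failure, and return the names that completed."""
    completed: List[str] = []
    for name, run in jobs:
        try:
            ok = run()
        except Exception:
            ok = False          # treat an exception like a failed job
        if not ok:
            on_failure(name)    # e.g. send a failure alarm
            break               # later jobs depend on this one, so do not start them
        completed.append(name)
    return completed


if __name__ == "__main__":
    alarms: List[str] = []
    plan = [
        ("sync_rds", lambda: True),
        ("sync_logs", lambda: False),     # simulated failure
        ("daily_report", lambda: True),   # never starts because sync_logs failed
    ]
    print(run_plan(plan, alarms.append), alarms)
```

In this model the report job is skipped whenever a synchronization job fails, so a report is never computed from a partially loaded partition.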