Code Repositories

Repo Description
Skool covers the following aspects:
  • Seamless data transfer from Hadoop into a relational database
  • Seamless data transfer from a relational database into Hadoop
  • File transfer and Hive table creation for file-based transfers into Hadoop
  • Automatic generation and deployment of file creation scripts and jobs from Hadoop or Hive tables
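To make "Hive table creation for file-based transfers" concrete, here is a minimal sketch of generating a Hive table definition over a delimited text file that has already landed in Hadoop. It does not show Skool's own generated scripts; the function name `hive_external_table_ddl`, the database and table names, the columns, and the path are all hypothetical. Only the HiveQL `CREATE EXTERNAL TABLE` syntax is standard.

```python
def hive_external_table_ddl(database, table, columns, location, delimiter=","):
    """Build a HiveQL CREATE EXTERNAL TABLE statement for a delimited text file.

    Illustrative helper only; it is not part of Skool.
    `columns` is a list of (name, hive_type) pairs.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # One "`name` TYPE" entry per column, each on its own line.
    column_defs = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_defs}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


# Hypothetical example values, chosen only for illustration.
ddl = hive_external_table_ddl(
    "staging",
    "customers",
    [("id", "BIGINT"), ("name", "STRING")],
    "/data/landing/customers",
)
print(ddl)
```

An external table is used in this sketch so that dropping the table leaves the transferred files in place; whether Skool makes the same choice is not stated in the description.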

Key Features:

  • The tool generates code that can be executed automatically (or scheduled) for delta and milestone replication at a defined data-refresh frequency.
  • The tool can be configured to select which tables, columns, and files are transferred into or out of Hadoop.
  • Built-in storage optimization delivers performant code: the tool takes table size, database partitions, file formats, and compression into account.
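The difference between the two replication modes can be sketched as follows. This is an illustration under stated assumptions, not Skool's implementation: it assumes "delta" means pulling only rows changed since the previous run (tracked through a watermark column) and "milestone" means pulling a full snapshot, which the description above does not define. The function `build_extract_query` and all table and column names are hypothetical.

```python
def build_extract_query(table, columns, mode, watermark_column=None, last_value=None):
    """Build a SELECT statement for a milestone (full) or delta (incremental) pull.

    Illustrative helper only; it is not part of Skool.
    """
    # An empty column selection falls back to all columns.
    column_list = ", ".join(columns) if columns else "*"
    query = f"SELECT {column_list} FROM {table}"
    if mode == "milestone":
        # Assumed meaning: a full snapshot of the selected columns.
        return query
    if mode == "delta":
        # Assumed meaning: only rows newer than the last recorded watermark.
        if watermark_column is None or last_value is None:
            raise ValueError("delta mode needs watermark_column and last_value")
        return f"{query} WHERE {watermark_column} > '{last_value}'"
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical table and columns, chosen only for illustration.
full_query = build_extract_query("SALES.ORDERS", ["ORDER_ID", "AMOUNT"], "milestone")
delta_query = build_extract_query(
    "SALES.ORDERS", ["ORDER_ID", "AMOUNT"], "delta",
    watermark_column="UPDATED_AT", last_value="2016-06-01 00:00:00",
)
print(full_query)
print(delta_query)
```

The sketch interpolates values directly into the SQL text to keep it short; a real job would use bind parameters or validated configuration values instead.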

Benefits of using Skool:

1. All scripts are provided to the user and can be customized as needed

2. Code consistency is maintained

3. Informative logging while the application runs

4. Audit/lineage is recorded in a Hive table at every action

5. Custom housekeeping

6. Support for both Avro and text files

7. Data can be imported from an Oracle database as well as from a server pushing files to Hadoop

8. Compression of the stored data

9. Tables created over the stored data

10. Automatic partitioning of tables

11. Support for both incremental and milestone data pulls

12. Customized job scheduling

Repo Info
GitHub repo URL: https://github.com/BT-Plc/skool
GitHub account name: BT-Plc
Repo name: skool
Last update: 06-13-2016 09:22 PM