(A) Steps to Access Head Node at WestGrid to Start PBS Job |
(1) qsub -I -l walltime=72:00:00,nodes=6:ppn=12,mem=132gb |
(2) ll /global/software/Hadoop-cluster/ -ltr |
(lists the installed versions: Hadoop 2.6.2, HBase 0.98.16.1, Phoenix 4.6.0) |
(3) module load Hadoop/2.6.2 |
(4) setup_start-Hadoop.sh f (f for format; do this only once…). |
(5) module load HBase/… |
(6) module load phoenix/… |
(7) (actually check the ingest.sh script under ~/bel_DAD) |
(8) hdfs dfsadmin -report |
(9) djps (displays the running JVMs, i.e. the Java services, with their PIDs) |
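The resource request in step (1) of part (A) is easy to mistype, because the list passed to -l is comma-separated and must not contain spaces. A minimal sketch, assuming a hypothetical helper named build_resources (not part of the original procedure), assembles the string from its parts and only prints the resulting command rather than submitting it:

```shell
#!/bin/sh
# build_resources is a hypothetical helper, not part of the original procedure.
# It joins the four values from step (1) into one space-free -l resource list.
build_resources() {
  walltime=$1
  nodes=$2
  ppn=$3
  mem=$4
  printf 'walltime=%s,nodes=%s:ppn=%s,mem=%s' "$walltime" "$nodes" "$ppn" "$mem"
}

# Values taken from step (1): 72 hours, 6 nodes, 12 processors per node, 132 GB.
resources=$(build_resources 72:00:00 6 12 132gb)

# Print the command for inspection; run it by hand once it looks right.
echo "qsub -I -l $resources"
```

Printing first makes it possible to confirm the request before an interactive job is actually queued.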
(B) Process to Ingest the File into Phoenix/HBase Database |
(1) module load Hadoop/2.6.2 |
(2) module load HBase/0.98.16.hdp262 |
(3) module load phoenix/4.6.0 |
(4) localFileName="The CSV file containing your data" |
(5) hdfs dfs -mkdir /data |
(6) hdfs dfs -put "$localFileName" /data/ |
(7) hdfs dfs -ls /data |
(8) sqlline.py hermes0090-ib0 DAD.sql |
(9) export HADOOP_CLASSPATH=/global/software/Hadoop-cluster/HBase-0.98.16.1/lib/HBase-protocol-0.98.16.1.jar:/global/software/Hadoop-cluster/HBase-0.98.16.1/lib/high-scale-lib-1.1.1.jar:/global/scratch/dchrimes/HBase-0.98.16.1/34434213.moab01.westgrid.uvic.ca/conf |
(10) time hadoop jar /global/software/Hadoop-cluster/phoenix-4.6.0/phoenix-4.6.0-HBase-0.98-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table DAD --input "/data/$localFileName" |
#psql.py -t DAD localhost all.csv |
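The ingest steps of part (B) can be rehearsed before touching the cluster. The following is a dry-run sketch under stated assumptions: DRY_RUN, run, join_classpath, ingest_plan, hdfsDir and hbaseLib are hypothetical names introduced here, all.csv is only a default borrowed from the commented psql.py line, and localFileName is assumed to be a bare file name. With DRY_RUN left at 1 the script prints each command instead of executing it.

```shell
#!/bin/sh
# Dry-run sketch of steps (4)-(10) of part (B). All helper names are hypothetical.
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Join classpath entries with ':' so that no spaces or line breaks slip in.
join_classpath() {
  joined=$1
  shift
  for entry in "$@"; do
    joined="$joined:$entry"
  done
  printf '%s' "$joined"
}

localFileName=${localFileName:-all.csv}   # assumption: a bare file name
hdfsDir=/data
hbaseLib=/global/software/Hadoop-cluster/HBase-0.98.16.1/lib

# Step (9): the job-specific conf directory would be appended as a third entry.
HADOOP_CLASSPATH=$(join_classpath \
  "$hbaseLib/HBase-protocol-0.98.16.1.jar" \
  "$hbaseLib/high-scale-lib-1.1.1.jar")
export HADOOP_CLASSPATH

ingest_plan() {
  run hdfs dfs -mkdir "$hdfsDir"
  run hdfs dfs -put "$localFileName" "$hdfsDir/"
  run hdfs dfs -ls "$hdfsDir"
  run sqlline.py hermes0090-ib0 DAD.sql
  run hadoop jar /global/software/Hadoop-cluster/phoenix-4.6.0/phoenix-4.6.0-HBase-0.98-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool --table DAD --input "$hdfsDir/$localFileName"
}

ingest_plan
```

Setting DRY_RUN=0 inside the PBS job, after the modules are loaded, would execute the same sequence for real.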
(C) Ingest All Using d_runAll.sh |
(1) First decide which schema file to use, DADV2.sql (for v2) or DAD.sql (for the old version), then check that its
column names are correct. |
(2) Create the database table using sqlline.py as illustrated above (sqlline.py hermes0090-ib0 DAD.sql) |
(3) Make sure all the modules are loaded: module load Hadoop/2.6.2; module load HBase/0.98.16.hdp262;
module load phoenix/4.6.0 |
(4) Generate the rest of the data (we need 10 billion; monitor the Big Data integer in the database). |
(5) Use the d_runAll.sh to ingest them all at once. |
(6) If a problem occurs (or persists), check the logs in the different locations (/global/scratch/dchrimes/ and/or
/scratch/JOBID on the nodes). |
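Step (6) of part (C) leaves the log search to the operator. One way to narrow it down is sketched below, assuming that the log files end in .log and that failures show up as lines containing "error" or "exception"; both are assumptions, as are the helper name find_failed_logs and the use of PBS_JOBID as the name of the per-job scratch directory.

```shell
#!/bin/sh
# find_failed_logs is a hypothetical helper: it lists the *.log files under the
# given directories that contain "error" or "exception" (case-insensitive).
find_failed_logs() {
  for dir in "$@"; do
    # Skip locations that do not exist on this node.
    [ -d "$dir" ] || continue
    # grep exits non-zero when nothing matches; that is not a failure here.
    find "$dir" -type f -name '*.log' \
      -exec grep -l -i -E 'error|exception' {} + || :
  done
}

# Assumption: the per-job scratch directory is named after the PBS job ID.
# Outside a job, JOBID remains a placeholder to be replaced by hand.
jobid=${PBS_JOBID:-JOBID}

find_failed_logs /global/scratch/dchrimes "/scratch/$jobid"
```

The output is only a list of candidate files; each one still has to be read to find the actual cause.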