Tutorials - Big Data Archives | Big Data & Java Success

Blog Archives

00d: Getting started with Python on Mac OS

Python is popular in Big Data & data science projects. This tutorial outlines the basic steps to get started with Python on Mac OS.

1. Install Xcode

Xcode can be installed from the Apple App Store. It is Apple’s Integrated Development Environment (IDE), a large suite of software development tools and libraries.

…

01: Getting started with Zookeeper tutorial

Installing Zookeeper on Windows

Step 1: Download Zookeeper from http://zookeeper.apache.org/. At the time of writing, the download is zookeeper-3.4.11.tar.gz.

Step 2: Using 7-Zip on Windows, unpack the gzipped tar file into a folder, e.g. c:\development\zookeeper-3.4.11. You can see “zkServer.cmd” in the bin folder for Windows &

…

01: Apache Flume with JMS source (Websphere MQ) and HDFS sink

Apache Flume is used in the Hadoop ecosystem for ingesting data. In this example, let’s ingest data from Websphere MQ.

Step 1: Apache Flume is config driven. The Flume config file flumeWebsphereMQQueue.conf is hierarchy driven. You need to define the “source”, “ …

01: Apache Hadoop HDFS Tutorial

Step 1: Download the latest version of “Apache Hadoop common” from http://apache.claz.org/hadoop using wget, curl or a browser. This tutorial uses “http://apache.claz.org/hadoop/core/hadoop-2.7.1/”.

Step 2: You can set the Hadoop environment variables by appending the following commands to the ~/.bashrc file.


export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home
export M3_HOME=~/tools/apache-maven-3.3.9
export HADOOP_HOME=~/hadoop-eco/hadoop-2.7.1
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$M3_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_INSTALL=$HADOOP_HOME

You can apply these settings to the current shell session from a Unix command prompt with:


$ source ~/.bashrc

Step 3: You can verify that Hadoop has been set up properly with:


$ hadoop version

Step 4: The file $HADOOP_HOME/etc/hadoop/hadoop-env.sh holds the JAVA_HOME setting.
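To see which JAVA_HOME value Hadoop will pick up, you can print the active export line from hadoop-env.sh. The snippet below is a minimal sketch: the helper name show_java_home_setting is made up for illustration, and the fallback install path simply reuses the HADOOP_HOME value from Step 2, with the config directory assumed to be the lowercase etc/hadoop.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the active (non-commented) JAVA_HOME export
# lines from the given hadoop-env.sh file.
show_java_home_setting() {
  local env_file="$1"
  grep -E '^[[:space:]]*export[[:space:]]+JAVA_HOME=' "$env_file"
}

# Assumption: HADOOP_HOME is set as in Step 2; otherwise fall back to the
# install path used in this tutorial.
env_file="${HADOOP_HOME:-$HOME/hadoop-eco/hadoop-2.7.1}/etc/hadoop/hadoop-env.sh"

if [ -f "$env_file" ]; then
  show_java_home_setting "$env_file" \
    || echo "No active JAVA_HOME export found in $env_file"
else
  echo "Not found: $env_file"
fi
```

If the printed line does not point at the JDK you expect, edit the export in hadoop-env.sh rather than relying only on ~/.bashrc.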

…

01: Apache Kafka example with Java – getting started tutorial

Apache Kafka with Java getting started tutorial demonstrates how quickly you can get started with Kafka using Docker.

Step 1: Make sure the Docker engine is installed on your computer. For example, on Mac OS run $ brew cask install docker, or install Docker on Windows.

…

01: Databricks getting started – PySpark, Shell, and SQL

Step 1: Sign up for the Databricks community edition at https://databricks.com/try-databricks. Fill in the details; you can leave your mobile number blank. Select “COMMUNITY EDITION” => “GET STARTED”.

If you have a Cloud account then you can use it.

…

01: Docker tutorial with Java & Maven

Prerequisite: Docker is installed on your machine, either on Mac OS X (e.g. $ brew cask install docker) or on Windows 10. Docker interview Q&As.

Step 1: Create a Java project “docker-test” with “HelloDocker.java” file under “src/main/java”

…

01: Getting started with Apache Kafka on Mac Tutorial

We have already seen how to run Apache Kafka on a Docker container via a docker-compose.yaml file at Apache Kafka example with Java – getting started tutorial. Now let’s look at how easy it is to install Apache Kafka on Mac.

Prerequisite: This tutorial assumes that Java 17 is installed.
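A quick way to confirm the prerequisite is to read the major version from the output of java -version. This is a sketch only: the function name java_major_version is invented for illustration, and it assumes the usual first-line format of that output (for example, a quoted "17.0.2" or the legacy "1.8.0_60").

```shell
#!/usr/bin/env bash

# Hypothetical helper: extract the major Java version from the first line
# of `java -version` output. Handles the legacy "1.8.0_60" scheme (-> 8)
# and the modern "17.0.2" scheme (-> 17).
java_major_version() {
  local line="$1" ver
  # Pull out the text between the double quotes after the word "version".
  ver=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;
    *)   printf '%s\n' "${ver%%[.-]*}" ;;
  esac
}

# `java -version` writes to stderr, hence the 2>&1 redirect.
if command -v java >/dev/null 2>&1; then
  major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
  if [ "${major:-0}" -ge 17 ] 2>/dev/null; then
    echo "Java $major found"
  else
    echo "Java 17 or later is assumed; found: ${major:-unknown}"
  fi
else
  echo "java is not on the PATH"
fi
```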

…

01: Installing & getting started with Apache Storm on Cloudera quickstart

Step 1: Download the latest version of Storm (e.g. apache-storm-1.1.1.tar.gz) from http://storm.apache.org/downloads.html. On the Cloudera machine it will be downloaded to the folder “/home/cloudera/Downloads”.

Step 2: Create a directory named “/opt/storm”, move the archive into it, and unpack it there.


bash-4.1$ cd /home/cloudera/Downloads
bash-4.1$ sudo mkdir /opt/storm
bash-4.1$ sudo mv /home/cloudera/Downloads/apache-storm-1.1.1.tar.gz  /opt/storm
bash-4.1$ cd /opt/storm
bash-4.1$ sudo gunzip apache-storm-1.1.1.tar.gz
bash-4.1$ sudo tar -xvf apache-storm-1.1.1.tar

Step 3: Start the services “Nimbus”, …

01: Learn Hadoop API by examples in Java

These Hadoop tutorials assume that you have installed Cloudera QuickStart, which includes the Hadoop ecosystem: HDFS, Spark, Hive, HBase, YARN, etc.

What is Hadoop & HDFS? Hadoop based data hub architecture & basics | Hadoop eco system basics Q&As style.

…

01: Spark RDD joins in Scala tutorial

This tutorial extends Setting up Spark and Scala with Maven.

Step 1: Let’s take a simple example of joining students to their departments. In SQL, this would be written as:


SELECT s.name, d.name 
FROM Student s
JOIN Department d on s.deptId = d.id
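To see what this inner join produces before moving to Spark, here is a small sketch that runs the same join on two tiny CSV files with the standard sort and join commands. The sample rows and file names are made up for illustration and are not part of the tutorial's data.

```shell
#!/usr/bin/env bash

# Hypothetical sample data: students as "name,deptId",
# departments as "id,name".
workdir=$(mktemp -d)
printf '%s\n' 'Alice,10' 'Bob,20' 'Carol,10' > "$workdir/student.csv"
printf '%s\n' '10,Physics' '20,Maths' '30,History' > "$workdir/department.csv"

# join(1) needs both inputs sorted on the join field, in the same locale.
LC_ALL=C sort -t, -k2,2 "$workdir/student.csv" > "$workdir/student.sorted"
LC_ALL=C sort -t, -k1,1 "$workdir/department.csv" > "$workdir/department.sorted"

# -1 2 : join on field 2 of the first file (s.deptId)
# -2 1 : join on field 1 of the second file (d.id)
# -o 1.1,2.2 : output s.name and d.name only
joined=$(LC_ALL=C join -t, -1 2 -2 1 -o 1.1,2.2 \
  "$workdir/student.sorted" "$workdir/department.sorted")

echo "$joined"
rm -rf "$workdir"
```

With this sample data the output is Alice,Physics, Carol,Physics and Bob,Maths, one pair per line. The History department has no students, so it does not appear, which is the inner-join behaviour the SQL above describes.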

Step 2: Let’s create classes to represent Student and Department data.

…

01: Spark tutorial- writing a file from a local file system to HDFS

This tutorial assumes that you have set up Cloudera as per the “cloudera quickstart vm tutorial installation” YouTube videos, which you can find by searching Google or YouTube. You can install it on VMware (non-commercial use) or on VirtualBox. I am using VMware. Cloudera requires at least 8GB RAM and 16GB is...

01. Setting up Scala & practicing the concepts via REPL the Scala way for Java developers

Scala runs on the JVM, so Java and Scala stacks can be freely mixed, and you can call Java libraries from Scala. Having said this, it is very important that you learn to write code the Scala way, not the Java way. Java programmers with Scala experience are currently in demand and tend to be paid more.

…
