Blog Archives

00d: Getting started with Python on Mac OS

Python is popular in Big Data & data science projects. This tutorial outlines the basic steps to get started with Python on Mac OS.

1. Install Xcode

Xcode can be installed from the Apple App Store. Xcode is Apple’s Integrated Development Environment (IDE), a large suite of software development tools and libraries.




01: Getting started with Zookeeper tutorial

Installing ZooKeeper on Windows

Step 1: Download ZooKeeper from http://zookeeper.apache.org/. At the time of writing, the download is zookeeper-3.4.11.tar.gz.

Step 2: Using 7-Zip on Windows, unpack the gzipped tar file into a folder, e.g. c:\development\zookeeper-3.4.11. You can see “zkServer.cmd” in the bin folder for Windows & …




01: Apache Flume with JMS source (Websphere MQ) and HDFS sink

Apache Flume is used in the Hadoop ecosystem for ingesting data. In this example, let’s ingest data from Websphere MQ.

Step 1: Apache Flume is config driven. The hierarchical Flume config goes in the flumeWebsphereMQQueue.conf file. You need to define the “source”, …



01: Apache Hadoop HDFS Tutorial

Step 1: Download the latest version of “Apache Hadoop common” from http://apache.claz.org/hadoop using wget, curl or a browser. This tutorial uses “http://apache.claz.org/hadoop/core/hadoop-2.7.1/”.

Step 2: You can set the Hadoop environment variables by appending the following commands to the ~/.bashrc file.

You can run this in a Unix command prompt as

Step 3: You can verify that Hadoop has been set up properly with

Step 4: The Hadoop file $HADOOP_HOME/etc/hadoop/hadoop-env.sh holds the JAVA_HOME setting.
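
As a rough illustration of Step 2, the sketch below builds the kind of export lines that are typically appended to ~/.bashrc for a Hadoop install. It is not the tutorial's own listing: the function name hadoop_bashrc_lines and the install path /opt/hadoop-2.7.1 are hypothetical, and the only assumption about Hadoop itself is that its commands live under bin and sbin inside the install directory.

```python
from pathlib import PurePosixPath


def hadoop_bashrc_lines(hadoop_home: str) -> list[str]:
    """Build export lines for a given Hadoop install directory.

    The directory is an assumption supplied by the caller; nothing is
    written to disk, the lines are only returned.
    """
    home = PurePosixPath(hadoop_home)  # normalises a trailing slash
    return [
        f"export HADOOP_HOME={home}",
        # bin holds the hadoop/hdfs commands, sbin the start/stop scripts
        "export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin",
    ]


if __name__ == "__main__":
    # /opt/hadoop-2.7.1 is a hypothetical install location
    print("\n".join(hadoop_bashrc_lines("/opt/hadoop-2.7.1")))
```

After appending lines like these, open a new shell or run source ~/.bashrc so that the variables take effect.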




01: Apache Kafka example with Java – getting started tutorial

This Apache Kafka with Java getting started tutorial demonstrates how quickly you can get up and running with Kafka using Docker.

Step 1: Make sure Docker Engine is installed on your computer, for example on Mac OS with $ brew cask install docker (newer Homebrew versions use $ brew install --cask docker), or on Windows.




01: Databricks getting started – PySpark, Shell, and SQL

Step 1: Sign up for Databricks Community Edition at https://databricks.com/try-databricks. Fill in the details; you can leave your mobile number blank. Select “COMMUNITY EDITION” and then “GET STARTED”.

If you already have a cloud account, you can use that instead.




01: Docker tutorial with Java & Maven

Prerequisite: Docker is installed on your machine, either on Mac OS X (e.g. $ brew cask install docker, or $ brew install --cask docker on newer Homebrew versions) or on Windows 10. See also Docker interview Q&As.

Step 1: Create a Java project “docker-test” with a “HelloDocker.java” file under “src/main/java”.




01: Getting started with Apache Kafka on Mac Tutorial

We have already seen how to run Apache Kafka in a Docker container via a docker-compose.yaml file in “Apache Kafka example with Java – getting started tutorial”. Now let’s look at how easy it is to install Apache Kafka on a Mac.

Prerequisite: This tutorial assumes that Java 17 is installed.




01: Installing & getting started with Apache Storm on Cloudera quickstart

Step 1: Download the latest version of Storm (e.g. apache-storm-1.1.1.tar.gz) from http://storm.apache.org/downloads.html. On the Cloudera machine it will be downloaded to the folder “/home/cloudera/Downloads”.

Step 2: Create a directory named “/opt/storm”, move the archive into it, and unzip it there.

Step 3: Start the services “Nimbus”, …



01: Learn Hadoop API by examples in Java

These Hadoop tutorials assume that you have installed Cloudera QuickStart, which includes Hadoop ecosystem components such as HDFS, Spark, Hive, HBase, and YARN.

Related: What is Hadoop & HDFS? | Hadoop-based data hub architecture & basics | Hadoop ecosystem basics, Q&A style.




01: Spark RDD joins in Scala tutorial

This tutorial extends “Setting up Spark and Scala with Maven”.

Step 1: Let’s take a simple example of joining students to their departments. In the SQL world, this would be written as:

Step 2: Let’s create classes to represent Student and Department data.
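
To make the join described in Steps 1 and 2 concrete, here is a minimal sketch in plain Python rather than Spark or Scala. It is not the tutorial's own code: the class fields, the sample rows, and the inner_join helper are all hypothetical. It only mirrors the shape of a keyed inner join, where each result is a (key, (left, right)) pair and unmatched rows are dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Department:
    dept_id: int
    name: str


@dataclass(frozen=True)
class Student:
    student_id: int
    name: str
    dept_id: int


def inner_join(students, departments):
    """Pair each student with its department, keyed by dept_id.

    Students whose dept_id has no matching department are dropped,
    which is what an inner join does. This sketch assumes dept_id is
    unique among the departments.
    """
    by_id = {d.dept_id: d for d in departments}
    return [
        (s.dept_id, (s, by_id[s.dept_id]))
        for s in students
        if s.dept_id in by_id
    ]


# Hypothetical sample data
departments = [Department(1, "Maths"), Department(2, "Physics")]
students = [
    Student(10, "Ann", 1),
    Student(11, "Bob", 2),
    Student(12, "Cid", 3),  # no department 3, so dropped by the join
]

for dept_id, (student, dept) in inner_join(students, departments):
    print(dept_id, student.name, dept.name)
```

Keying both sides by the department id before joining is the same idea a pair-based RDD join relies on: both datasets must share a common key type.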




01: Spark tutorial – writing a file from a local file system to HDFS

This tutorial assumes that you have set up Cloudera as per the “cloudera quickstart vm tutorial installation” YouTube videos, which you can find by searching Google or YouTube. You can install it on VMware (non-commercial use) or on VirtualBox. I am using VMware. Cloudera requires at least 8GB RAM and 16GB is...



01: Setting up Scala & practicing the concepts via REPL the Scala way for Java developers

Scala runs on the JVM, so Java and Scala stacks can be freely mixed, and you can call Java libraries from Scala. Having said this, it is very important that you learn to write code the Scala way, not the Java way. Currently, Java programmers with Scala experience are in demand and tend to be paid more.




