Hadoop

Blog posts about Hadoop

lite-log hadoop hdfs

Resolve Hadoop RemoteException - Name node is in safe mode

10 views   0 comments last modified about 8 days ago

In safe mode, the HDFS cluster is read-only. Once block replication maintenance completes, the name node leaves safe mode automatically. If you try to delete files while the name node is in safe mode, the following exception may be thrown: org.apache.hadoop.ipc.RemoteException(org.apac...

hadoop hdfs parquet sqoop

Configure Sqoop in an Edge Node of Hadoop Cluster

52 views   0 comments last modified about 8 days ago

This page builds on the following documentation about configuring a multi-node Hadoop cluster, adding a new edge node to host administration or client tools. ...

hadoop yarn

Configure YARN and MapReduce Resources in Hadoop Cluster

6 views   0 comments last modified about 8 days ago

When configuring YARN and MapReduce in a Hadoop cluster, it is important to set the memory and virtual processors correctly. If these settings are wrong, nodes may fail to start and applications may fail to run. For example...

hadoop yarn hdfs

Configure Hadoop 3.1.0 in a Multi Node Cluster

261 views   0 comments last modified about 8 days ago

Previously, I summarized the steps to install Hadoop on a single-node Windows machine: Install Hadoop 3.0.0 in Windows (Single Node). In this page, I...

lite-log

Install Big Data Tools (Spark, Zeppelin, Hadoop) in Windows for Learning and Practice

134 views   2 comments last modified about 15 days ago

Are you a Windows/.NET developer who wants to learn big data concepts and tools on Windows? If so, follow the links below to install them on your PC. Installation is usually easier on Linux/UNIX, but it is not difficult on Windows either since the...

hadoop yarn hdfs

Default Ports Used by Hadoop Services (HDFS, MapReduce, YARN)

36 views   0 comments last modified about 22 days ago

This page summarizes the default ports used by Hadoop services. It is useful when configuring network interfaces in a cluster. Hadoop 3.1.0. HDFS: The secondary namenode http/https server address and port. ...

sql server spark hdfs parquet sqoop

Load Data into HDFS from SQL Server via Sqoop

33 views   0 comments last modified about 28 days ago

This page shows how to import data from SQL Server into Hadoop via Apache Sqoop. Prerequisites: follow the link below to install Sqoop on your machine if you don't have an environment ready. ...

hadoop yarn hdfs

Install Hadoop 3.0.0 in Windows (Single Node)

927 views   8 comments last modified about 3 months ago

This page summarizes the steps to install Hadoop 3.0.0 in your Windows environment. Reference page: https://wiki.apache.org/hadoop/Hadoop2OnWindows ...

lite-log spark hdfs scala parquet

Write and Read Parquet Files in HDFS through Spark/Scala

171 views   0 comments last modified about 3 months ago

In my previous post, Write and Read Parquet Files in Spark/Scala, I demonstrated how to write and read Parquet files in Spark/Scala. There, the Parquet file destination was a local folder. In this page...

zeppelin spark hadoop rdd

Read Text File from Hadoop in Zeppelin through Spark Context

131 views   0 comments last modified about 3 months ago

Background: This page provides an example of loading a text file from HDFS through the SparkContext (sc) in Zeppelin. Reference: Details about this method can be found at SparkContext.textFile ...
