Cloudera Hadoop Administrator

Ventois is always looking for talented people to become its team members. We realize that productive people are always a good addition to our organization. So if you have the technical acumen and the passion to work with some of the biggest companies in the world, join in!

Project Location(s): Work location is Shrewsbury, MA, with required travel to client locations throughout the USA.
Education: A minimum of a bachelor’s degree in computer science, computer information systems, information technology, or a closely related field, or a combination of education and experience equating to the U.S. equivalent of a bachelor’s degree in one of the aforementioned subjects.

Responsibilities

  • Leverage CDP features to build hybrid-cloud infrastructure on CDP Public Cloud.
  • Independently install and maintain Big Data (Cloudera) clusters in highly available, load-balanced configurations across multiple environments (Production, QA, and Development), both on-premises and in the cloud (AWS).
  • Implement Knox and Kerberos for cluster security and integrate them with enterprise and cloud IAM (see the Kerberos sketch after this list).
  • Develop scripts to automate and streamline operations and configuration.
  • Manage and automate the installation process (using tools like Ansible) for Cloudera Manager, CDH, and the ecosystem projects. Activities include: set up a local CDH repository; perform OS-level configuration for Hadoop installation; install the Cloudera Manager server and agents; install CDH using Cloudera Manager; add a new node to an existing cluster; add a service using Cloudera Manager (see the Cloudera Manager API sketch after this list).
  • Schedule jobs using Apache NiFi or Apache Airflow (see the Airflow DAG sketch after this list).
  • Analyze, recommend, and implement improvements to support environment/infrastructure management initiatives. Perform the basic and advanced configuration needed to effectively administer a Hadoop cluster on-premises and in the cloud. Activities include: configure a service using Cloudera Manager; create an HDFS user’s home directory; configure NameNode HA; configure ResourceManager HA; configure a proxy for HiveServer2/Impala.
  • Maintain and modify the cluster to support day-to-day operations in the enterprise. Activities include: rebalance the cluster; set up alerting for excessive disk fill; define and install a rack topology script (see the sketch after this list); install a new type of I/O compression library in the cluster; revise YARN resource assignments based on user feedback; commission/decommission nodes.
  • Enable relevant services and configure the cluster to meet goals defined by security policies. Activities include: configure HDFS ACLs (see the ACL and encryption-zone sketch after this list); install and configure Sentry; configure Hue user authorization and authentication; enable/configure log and query redaction; create encrypted zones in HDFS.
  • Benchmark the cluster’s operational metrics and test system configuration for operation and efficiency. Activities include: efficiently copy data within a cluster and between clusters; create/restore a snapshot of an HDFS directory; get/set ACLs for a file or directory structure; benchmark the cluster for I/O, CPU, and network throughput (see the benchmarking sketch after this list).
  • Under general supervision, manage Big Data administration activities, technical documentation, system performance support, and internal customer support. May provide input into the development of systems architecture for mission-critical corporate development projects.
  • Research performance issues, configure the cluster according to Cloudera best practices, and optimize specifications and parameters to fine-tune the environment and proactively avoid performance issues.
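
The sketches below illustrate several of the activities above. They are minimal, hedged examples rather than Ventois procedures: every host name, path, principal, key name, and credential is a placeholder to adapt to the target cluster.

First, the Kerberos side of cluster security: a sketch of obtaining a ticket from a keytab non-interactively so that scripted HDFS commands authenticate against a Kerberized cluster. The keytab path and principal below are assumptions.

```python
import subprocess

# Hypothetical keytab path and principal -- substitute your realm's values.
KEYTAB = "/etc/security/keytabs/hdfs.headless.keytab"
PRINCIPAL = "hdfs@EXAMPLE.COM"

def kinit_from_keytab(keytab: str, principal: str) -> None:
    """Obtain a Kerberos TGT non-interactively so that subsequent
    hdfs/yarn CLI calls authenticate against the secured cluster."""
    subprocess.run(["kinit", "-kt", keytab, principal], check=True)

def hdfs_ls(path: str) -> str:
    """Run an authenticated HDFS command and return its output."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout

if __name__ == "__main__":
    kinit_from_keytab(KEYTAB, PRINCIPAL)
    print(hdfs_ls("/user"))
```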
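
Next, a sketch of automating against the Cloudera Manager REST API, as used when scripting installation and service management. The host, port, API version, and credentials are placeholders; in practice you would use TLS and real credentials.

```python
import requests

# Hypothetical Cloudera Manager endpoint and credentials -- placeholders.
CM_BASE = "http://cm-host.example.com:7180/api/v33"
AUTH = ("admin", "admin")

def list_clusters():
    """Return the clusters Cloudera Manager knows about."""
    resp = requests.get(f"{CM_BASE}/clusters", auth=AUTH, timeout=30)
    resp.raise_for_status()
    return resp.json()["items"]

def list_services(cluster_name: str):
    """Return the services (HDFS, YARN, Hive, ...) in one cluster."""
    resp = requests.get(
        f"{CM_BASE}/clusters/{cluster_name}/services", auth=AUTH, timeout=30
    )
    resp.raise_for_status()
    return resp.json()["items"]

if __name__ == "__main__":
    for cluster in list_clusters():
        print(cluster["name"])
        for svc in list_services(cluster["name"]):
            print("  ", svc["name"], svc.get("serviceState"))
```

The same API also accepts configuration updates (PUT requests against a service’s config resource), which is how a tool like Ansible would drive repeatable service configuration.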
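
For the job-scheduling bullet, a minimal Airflow DAG sketch, assuming Airflow 2.x with the `hdfs` and `hadoop` CLIs on the worker’s PATH; the DAG id, schedule, and distcp URIs are invented for illustration.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="nightly_hdfs_maintenance",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",      # run at 02:00 daily
    catchup=False,
) as dag:
    # Rebalance HDFS blocks across DataNodes with a 10% utilization threshold.
    balance = BashOperator(
        task_id="hdfs_balancer",
        bash_command="hdfs balancer -threshold 10",
    )

    # Copy the day's data to a second cluster (placeholder NameNode URIs).
    backup = BashOperator(
        task_id="distcp_backup",
        bash_command=(
            "hadoop distcp -update "
            "hdfs://prod-nn:8020/data/events "
            "hdfs://dr-nn:8020/data/events"
        ),
    )

    balance >> backup  # rebalance first, then replicate
```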
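
A sketch of a rack topology script. HDFS calls the script configured via `net.topology.script.file.name` with one or more host names/IPs as arguments and expects one rack path per argument on stdout; the subnet-to-rack mapping here is invented.

```python
#!/usr/bin/env python3
"""Map DataNode addresses to rack paths for HDFS rack awareness."""
import sys

# Hypothetical IP-prefix-to-rack mapping -- adjust to the real network layout.
RACKS = {
    "10.1.1.": "/dc1/rack1",
    "10.1.2.": "/dc1/rack2",
}
DEFAULT_RACK = "/default-rack"

def rack_for(host: str) -> str:
    for prefix, rack in RACKS.items():
        if host.startswith(prefix):
            return rack
    return DEFAULT_RACK

# One rack path per argument, in order, as HDFS expects.
for arg in sys.argv[1:]:
    print(rack_for(arg))
```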
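
A sketch of the HDFS ACL and encrypted-zone tasks, wrapping the standard `hdfs` CLI. The group, key name, and paths are placeholders, and the encryption zone assumes a KMS key was created beforehand (e.g. with `hadoop key create`).

```python
import subprocess

def run(cmd):
    """Echo and execute an HDFS admin command, failing fast on errors."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Grant a (hypothetical) analytics group read/execute on a shared directory.
run(["hdfs", "dfs", "-setfacl", "-m", "group:analytics:r-x", "/data/shared"])
run(["hdfs", "dfs", "-getfacl", "/data/shared"])

# Create an encryption zone over a new directory using a pre-created KMS key.
run(["hdfs", "dfs", "-mkdir", "-p", "/data/pii"])
run(["hdfs", "crypto", "-createZone", "-keyName", "pii-key", "-path", "/data/pii"])
```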
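
Finally, a sketch of snapshotting and benchmarking. TestDFSIO ships in the MapReduce job-client tests jar; its exact path and option names vary by Hadoop version, so treat the jar location and flags below as assumptions to verify on the cluster.

```python
import subprocess

def run(cmd):
    """Echo and execute a Hadoop command, failing fast on errors."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Permit and take a named snapshot of a directory (placeholder path/name).
run(["hdfs", "dfsadmin", "-allowSnapshot", "/data/warehouse"])
run(["hdfs", "dfs", "-createSnapshot", "/data/warehouse", "pre-upgrade"])

# Write-throughput benchmark with TestDFSIO; the jar path is CDH-parcel
# style and version-dependent -- locate the tests jar on your own cluster.
TEST_JAR = "/opt/cloudera/parcels/CDH/jars/hadoop-mapreduce-client-jobclient-tests.jar"
run(["hadoop", "jar", TEST_JAR, "TestDFSIO",
     "-write", "-nrFiles", "8", "-size", "1GB"])
```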

Skills/Experience

Great interpersonal communication skills;
A keen eye for spotting data trends;
Great analytical skills;
A keen grasp of information technology;
Professional demeanor;
Personal accountability and strong work ethic;
Professional, able to interact with vendors/clients;
Positive, “can-do” attitude.
