Big Data Platform Engineer



Costa Mesa, CA
Full Time
2y ago

Company Description

Ascend Sandbox is an integrated BigData and analytics platform. Our clients, some of America’s largest financial institutions, use this platform to understand consumer credit behavior and build models for credit decisioning, marketing and account review purposes. We are looking for an expert platform engineer with in-depth knowledge of BigData analytics and public cloud platforms to help us run this petabyte-scale BigData platform and provide the best possible experience for our clients.

What You’ll Do Here

•  Responsible for continuous platform enhancements, upgrades, availability, reliability and security of the Ascend Sandbox platform.

• Provide end-to-end observability of our Ascend Sandbox platform.

• Resolve incidents reported by Sandbox users and take preventive action.

•  Help Sandbox users with troubleshooting failed MapReduce/Hive/Spark applications.

•  Help Sandbox users optimize and improve the performance of their MapReduce/Hive/Spark applications.

•  Participate in a follow-the-sun on-call rotation to address any emergency production incidents affecting the Sandbox platform.


What You'll Need To Succeed

Must-have skills:

• Deep understanding of Linux, networking fundamentals and security.

• Solid professional coding experience with at least one scripting language (Shell, Python, etc.).

• Experience working with AWS cloud platform and infrastructure.

• Experience managing large BigData clusters in production (at least one of Cloudera, Hortonworks, or EMR).

•  Excellent knowledge of, and solid work experience in, providing observability for BigData platforms using tools like Prometheus, InfluxDB, Dynatrace, Grafana, Splunk, etc.

•  Experience managing BigData clusters with compute decoupled from storage (e.g., S3) on public cloud platforms.

• Expert knowledge of Hadoop Distributed File System (HDFS) and Hadoop YARN.

•  Decent knowledge of various Hadoop file formats like ORC, Parquet, Avro etc.

• Deep understanding of Hive (Tez), Hive LLAP, Presto and Spark compute engines.

•  Ability to understand query plans and optimize performance for complex SQL queries on Hive and Spark.

•  Hands-on experience supporting Spark with Python (PySpark) and R (sparklyr, SparkR).

• Experience working with Data Analysts, Data Scientists and at least one related analytical application such as SAS, RStudio, JupyterHub, or H2O.



• Able to read and understand code (Java, Python, R, Scala), with expertise in at least one scripting language.

• Experience managing JVM-based applications in production.

•  Excellent written and oral communication.


Nice-to-have skills:

•  Experience with workflow management tools like Airflow, Oozie etc.

•  Experience implementing Terraform, Packer, Ansible, Chef, Jenkins or similar tooling.

•  Prior working knowledge of Active Directory and Windows OS-based VDI platforms like Citrix, AWS WorkSpaces, etc.

•  Professional coding experience in at least one programming language, preferably Java.

•  Experience with other public cloud platforms like Azure and GCP is a bonus.


Additional Information

All your information will be kept confidential according to EEO guidelines.

Experian is proud to be an Equal Opportunity and Affirmative Action employer. Our goal is to create a thriving, inclusive and diverse team where people love their work and love working together. We believe that diversity, equity and inclusion are essential to our purpose of creating a better tomorrow. We value the uniqueness of every individual and want you to bring your whole, authentic self to work. For us, this is The Power of YOU and it ensures that we live what we believe.

By: Onkar