Falcon IT and Staffing Solutions

Junior Data Engineer

Washington, D.C., District of Columbia, United States · Full-time · Posted about 1 month ago

About the role

Seeking a Junior Data Engineer to support the Department of Transportation. The focus of this role is supporting the modernization of legacy Informatica-based ETL pipelines into Databricks using PySpark and Spark SQL. This position supports data migration and modernization efforts within a data-heavy, potentially regulated environment. Candidates will work closely with senior engineers, data architects, and QA teams during iterative migration cycles.

Job Responsibilities

  • Analyze Informatica workflows and mappings to understand source-to-target logic, transformations, dependencies, and scheduling order.
  • Convert Informatica mappings into Databricks pipelines using PySpark and Spark SQL.
  • Implement data ingestion from on-prem and cloud sources into Databricks using the medallion architecture (landing → bronze → silver).
  • Adapt existing ETL logic to align with a new enterprise data model, identifying gaps and required transformation changes.
  • Support unit testing, reconciliation, and data validation between legacy and modern pipelines.
  • Validate row counts, aggregates, and business rules to ensure data accuracy and consistency.
  • Document migration logic, assumptions, and deviations from legacy behavior.
  • Collaborate with senior engineers, data architects, and QA teams throughout iterative migration cycles.
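The validation work described above — reconciling row counts and aggregates between legacy and modernized pipelines — can be sketched as a simple comparison of metrics across two tables. This is a minimal illustration only, using Python's stdlib `sqlite3` as a stand-in for Spark SQL; the table and column names are hypothetical, not part of the actual DOT environment:

```python
import sqlite3

# Hypothetical stand-in tables: in the real pipeline these would be the
# legacy (Informatica-loaded) and modernized (Databricks) copies of a dataset.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE legacy_orders (id INTEGER, amount REAL)")
cur.execute("CREATE TABLE modern_orders (id INTEGER, amount REAL)")
rows = [(1, 10.0), (2, 20.5), (3, 5.0)]
cur.executemany("INSERT INTO legacy_orders VALUES (?, ?)", rows)
cur.executemany("INSERT INTO modern_orders VALUES (?, ?)", rows)

def reconcile(table_a, table_b):
    """Compare row counts and a key aggregate between two tables.

    Returns {metric: (value_a, value_b, matched)} for each check.
    """
    checks = {}
    for metric, expr in [("row_count", "COUNT(*)"), ("total_amount", "SUM(amount)")]:
        a = cur.execute(f"SELECT {expr} FROM {table_a}").fetchone()[0]
        b = cur.execute(f"SELECT {expr} FROM {table_b}").fetchone()[0]
        checks[metric] = (a, b, a == b)
    return checks

result = reconcile("legacy_orders", "modern_orders")
```

In practice the same pattern extends to business-rule checks (per-column aggregates, distinct counts, checksum comparisons) run as part of each iterative migration cycle.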

Required skills

Data engineering · ETL development · Data integration · Informatica PowerCenter · Common transformations · Databricks · SQL · ETL / ELT · Public Trust · Data Modeling

Preferred skills

AWS · Delta Lake

Education requirements

Degree
Bachelor
Major
Computer Science

Job Requirements

  • 2–3 years of experience in data engineering, ETL development, or data integration.
  • Working knowledge of Informatica PowerCenter, including mappings, workflows, and sessions, as well as common transformations (Source Qualifier, Expression, Lookup, Joiner, Aggregator, Router, Filter).
  • Basic to intermediate experience with Databricks, including PySpark, Spark SQL, and notebooks and jobs.
  • Strong SQL fundamentals, including joins, aggregations, and window functions.
  • Solid understanding of ETL / ELT concepts, data warehousing principles, and batch processing.
  • Strong attention to detail and analytical skills.
  • Ability to clearly document technical logic and communicate findings to technical team members.
  • Data modeling and analysis skills:
      • Interpret legacy data models and mapping documentation.
      • Identify how legacy fields map (or do not map) to a new target data model.
      • Flag missing logic, derived fields, or transformation gaps early in the migration process.
      • Perform detailed data validation, including reconciliation of row counts and aggregates.
  • Prior experience in regulated or data-heavy environments (finance, government, healthcare) is preferred.
  • Exposure to Informatica-to-Databricks migrations or similar data modernization efforts is preferred.
  • Familiarity with Delta Lake and medallion architecture (Bronze / Silver / Gold) is preferred.
  • Basic understanding of AWS, including S3 and IAM concepts.
  • Experience reading or generating code from Informatica XML exports.
  • Bachelor’s Degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Experience supporting report rationalization and data migration initiatives.
  • Clearance: U.S. citizen eligible to obtain a DOT Public Trust.

Ready to apply?

Submit your application and we'll get back to you within a few days.

Job details

Location
United States, District of Columbia, Washington, D.C.
Employment Type
Full-time
Location Type
Hybrid
Compensation
$70,000 – $90,000 / year