Personal Skill Set
Data Processing
Representative Skills→ Data Cleaning and Preprocessing, Data Integration, Data Storage and Management, Data Analysis and Modeling, Data Governance and Security, Database Performance Tuning, Data Visualization, Big Data Technologies, Cloud Computing and Containerization
Software Development
Representative Skills→ Object-Oriented Programming, Software Development Methodologies, Version Control Systems, Web Development, APIs and Web Services, Cloud Services, Software Testing, Software Architecture and Design, Security Best Practices, IDEs, CI/CD
Technical Support
Representative Skills→ 24x7 On-call Support, Version Control Systems, CI/CD, Configuration Management, Infrastructure as Code, Containerization, Container Orchestration, Cloud Services, Monitoring and Logging, Basic Networking Knowledge, Scripting, Security and Compliance
Work Experience
Data Engineer | Vancouver
Defend
August 2023 – June 2024
Implemented ETL operations in a high-volume data environment, reducing data load time by 25%; solely responsible for designing and documenting 30+ table schemas and for data quality across product verticals and related business areas.
- Communicated with business departments to understand their needs and built 10+ data pipelines for analyzing technical issues.
- Built efficient data pipelines with Google Cloud Dataflow, enabling faster data transformation.
- Used Airflow to build ETL solutions that helped improve conversion rates by 16%.
- Developed and optimized complex SQL queries and designed indexes, improving query speed by 35%.
KEY TECHNOLOGIES: ETL, SQL, Python, Airflow, Kettle, Power BI, Azure, Scripting
Data Engineer | Beijing
Beijing Rainer Technology
October 2019 – June 2021
Led a team of 8 data engineers in designing and implementing scalable data solutions to support various business functions. Discussed customer needs and documented project requirements, ensuring all data solutions met customer expectations.
- Created data mapping documentation based on customer requirements.
- Designed and implemented scalable data models, increasing data query performance by 35%.
- Optimized existing DB2 databases, reducing storage costs by 20% and improving system performance.
- Wrote DB2 stored procedures to extract and transform data, enhancing data processing accuracy and efficiency.
- Designed and optimized ETL pipelines using Python and SQL, reducing data processing time by 30%.
- Automated data loading and reporting processes with shell scripting, decreasing manual effort by 80% and improving accuracy.
- Led efforts to migrate local data infrastructure to Azure cloud services, enhancing scalability and reducing maintenance costs.
KEY TECHNOLOGIES: DB2, SQL Server, SQL, Linux, Shell, Kettle, Tableau, AWS, Python, Logistic Regression, Linear Regression
BI Engineer | Beijing
Beijing Huaxia Diantong IT
June 2019 – October 2019
- Supported the development of ETL processes while updating the data mapping documentation as needed.
- Developed ETL processes with Kettle and SQL, optimizing data flow from source databases to a central MySQL data warehouse.
- Provided solutions and data support to on-site implementation engineers.
KEY TECHNOLOGIES: MySQL, Oracle, SQL, Kettle, Tableau, Java
Lead Data Engineer | Guilin
Jiade IT Solutions
January 2018 – May 2019
Led a team of 3 interns in developing and optimizing data engineering solutions, providing mentorship and guidance that fostered their professional growth in data engineering practices.
- Applied agile development to enhance project delivery efficiency, resulting in a 20% increase in productivity.
- Collaborated with cross-functional teams to identify and address data needs, resulting in improved data quality and accessibility.
- Utilized MySQL database for data storage and management, ensuring data integrity and reliability.
- Developed and maintained ETL processes, enhancing data extraction and transformation accuracy by 20% using Kettle.
- Implemented Airflow for scheduling and orchestrating complex data workflows, improving data processing automation.
- Reduced database downtime by 30% by implementing a robust backup and recovery system utilizing MySQL and Azure services.
KEY TECHNOLOGIES: MySQL, Oracle, SQL, Kettle, Tableau, Java
Education
University of Nottingham | Nottingham, UK
Master of Science, major in Artificial Intelligence
- Graduated with Merit
- Core Modules: Machine Learning, Advanced Data Structures and Algorithms, Data Modeling and Analysis, etc.
- Graduation thesis: Topic Modeling on Google Retrieval History for User Experience
Guangxi Normal University | Guilin, China
Bachelor of Science, major in Computer Science
- GPA: 3.53/4.0 (85.3% - Top 3)
- Core Modules: Object-Oriented Programming, Advanced Java Software Development, Computer Networks, Operating Systems, Big Data, Web Design, Databases, Linear Algebra and Discrete Mathematics, etc.
- Graduation thesis: Conversational Chatbot based on DuerOS Core