AI Data Mining Engineer
Responsibilities:
1. Integrate data from multiple sources; design and implement automated integration solutions.
2. Develop data cleaning pipelines for industrial business scenarios to ensure the quality and consistency of integrated data.
3. Build datasets for training and testing machine learning models.
4. Explore and analyze large-scale business datasets to identify issues and trends, and apply the findings to industrial manufacturing.
5. Assess data validity and propose solutions to improve data quality.
6. Participate in establishing the data governance framework to provide a stable data foundation for analysis.
7. Continuously monitor data sources and processing workflows to ensure system robustness and reliability.
8. Write clear technical documentation aligned with business needs, recording data processing workflows and practices.
Qualifications:
1. Bachelor's degree or higher in Computer Science, Statistics, Data Science, or a related field, with at least 5 years of work experience.
2. Proficiency in at least one programming language (e.g., Python, Java, Scala), and familiarity with common databases and AI tools such as Copilot.
3. Strong SQL skills, including the ability to write efficient queries.
4. Familiarity with common data storage solutions (e.g., TiDB, Hadoop, Spark).
5. Ability to use ETL tools or develop ETL processes independently.
6. Strong cross-department communication skills and a collaborative team spirit.
7. Experience with machine learning, deep learning, or large model projects is preferred.
8. Experience handling large-scale industrial datasets is a plus.