IMCatalina-v1.0/README.md

---
language:
- en
base_model:
- microsoft/phi-4
pipeline_tag: text-generation
library_name: transformers
tags:
- phi
- fine-tuned
- full-finetune
- instruction-tuning
- text-generation
- recruitment
- resume-parsing
- job-description-generation
---

# IMCatalina-v1.0

## Model summary
**IMCatalina-v1.0** is a **fully fine-tuned** version of **Phi-4** specialized in **recruitment document processing**.

The model focuses exclusively on:
- Parsing unstructured CVs/resumes
- Converting CV content into structured formats (JSON / YAML)
- Generating professional job descriptions from structured inputs

This model was trained end-to-end (full fine-tuning) and **does not perform candidate scoring, ranking, or hiring decisions**.

---

## Intended use

### Primary use cases
- CV and resume parsing
- Structured CV normalization (JSON / YAML)
- Extraction of skills, roles, education, and experience
- Job description generation for recruitment platforms
- Preprocessing for ATS and HR systems

### Explicitly out-of-scope
- Candidate ranking or scoring
- Hiring recommendations
- Candidate–job matching
- Automated decision-making
- Psychological or behavioral inference

---

## Model details
- **Base model:** microsoft/phi-4
- **Model type:** Decoder-only causal language model
- **Architecture:** Transformer (Phi family)
- **Parameters:** ~14B
- **Context length:** up to 16k tokens
- **Languages:** English
- **Training type:** Full fine-tuning

---

## Training

### Training data
- **Domain:** Recruitment and HR documentation
- **Data type:** Synthetic and curated structured data
- **Formats:**
  - Instruction–response
  - Schema-constrained generation
- **Content includes:**
  - CVs and resumes
  - Job descriptions
  - Skills, roles, education, and experience fields
- **Data processing:**
  - Deduplication
  - Schema validation
  - Removal of malformed samples
  - Consistency and format checks

> No real personal data was intentionally included in the training datasets.
-												初始化项目，由ModelHub XC社区提供模型

Model: rmtlabs/IMCatalina-v1.0
Source: Original Platform

											
										
										
											2026-04-10 15:13:17 +08:00
+								---
 								language:
 								- en
 								base_model:
 								- microsoft/phi-4
 								pipeline_tag: text-generation
 								library_name: transformers
 								tags:
 								- phi
 								- fine-tuned
 								- full-finetune
 								- instruction-tuning
 								- text-generation
 								- recruitment
 								- resume-parsing
 								- job-description-generation
 								---
 								# IMCatalina-v1.0
 								## Model summary
 								**IMCatalina-v1.0** is a **fully fine-tuned** version of **Phi-4** specialized in **recruitment document processing**.
 								The model focuses exclusively on:
 								- Parsing unstructured CVs/resumes
 								- Converting CV content into structured formats (JSON / YAML)
 								- Generating professional job descriptions from structured inputs
 								This model was trained end-to-end (full fine-tuning) and **does not perform candidate scoring, ranking, or hiring decisions**.
 								---
 								## Intended use
 								### Primary use cases
 								- CV and resume parsing
 								- Structured CV normalization (JSON / YAML)
 								- Extraction of skills, roles, education, and experience
 								- Job description generation for recruitment platforms
 								- Preprocessing for ATS and HR systems
 								### Explicitly out-of-scope
 								- Candidate ranking or scoring
 								- Hiring recommendations
 								- Candidate–job matching
 								- Automated decision-making
 								- Psychological or behavioral inference
 								---
 								## Model details
 								- **Base model:** microsoft/phi-4
 								- **Model type:** Decoder-only causal language model
 								- **Architecture:** Transformer (Phi family)
 								- **Parameters:** ~14B
 								- **Context length:** up to 16k tokens
 								- **Languages:** English
 								- **Training type:** Full fine-tuning
 								---
 								## Training
 								### Training data
 								- **Domain:** Recruitment and HR documentation
 								- **Data type:** Synthetic and curated structured data
 								- **Formats:**
 								  - Instruction–response
 								  - Schema-constrained generation
 								- **Content includes:**
 								  - CVs and resumes
 								  - Job descriptions
 								  - Skills, roles, education, and experience fields
 								- **Data processing:**
 								  - Deduplication
 								  - Schema validation
 								  - Removal of malformed samples
 								  - Consistency and format checks
 								> No real personal data was intentionally included in the training datasets.