81 lines
1.9 KiB
Markdown
81 lines
1.9 KiB
Markdown
---
|
||
language:
|
||
- en
|
||
base_model:
|
||
- microsoft/phi-4
|
||
pipeline_tag: text-generation
|
||
library_name: transformers
|
||
tags:
|
||
- phi
|
||
- fine-tuned
|
||
- full-finetune
|
||
- instruction-tuning
|
||
- text-generation
|
||
- recruitment
|
||
- resume-parsing
|
||
- job-description-generation
|
||
---
|
||
|
||
# IMCatalina-v1.0
|
||
|
||
## Model summary
|
||
**IMCatalina-v1.0** is a **fully fine-tuned** version of **Phi-4** specialized in **recruitment document processing**.
|
||
|
||
The model focuses exclusively on:
|
||
- Parsing unstructured CVs/resumes
|
||
- Converting CV content into structured formats (JSON / YAML)
|
||
- Generating professional job descriptions from structured inputs
|
||
|
||
This model was trained end-to-end (full fine-tuning) and **does not perform candidate scoring, ranking, or hiring decisions**.
|
||
|
||
---
|
||
|
||
## Intended use
|
||
|
||
### Primary use cases
|
||
- CV and resume parsing
|
||
- Structured CV normalization (JSON / YAML)
|
||
- Extraction of skills, roles, education, and experience
|
||
- Job description generation for recruitment platforms
|
||
- Preprocessing for ATS and HR systems
|
||
|
||
### Explicitly out-of-scope
|
||
- Candidate ranking or scoring
|
||
- Hiring recommendations
|
||
- Candidate–job matching
|
||
- Automated decision-making
|
||
- Psychological or behavioral inference
|
||
|
||
---
|
||
|
||
## Model details
|
||
- **Base model:** microsoft/phi-4
|
||
- **Model type:** Decoder-only causal language model
|
||
- **Architecture:** Transformer (Phi family)
|
||
- **Parameters:** ~14B
|
||
- **Context length:** up to 16k tokens
|
||
- **Languages:** English
|
||
- **Training type:** Full fine-tuning
|
||
|
||
---
|
||
|
||
## Training
|
||
|
||
### Training data
|
||
- **Domain:** Recruitment and HR documentation
|
||
- **Data type:** Synthetic and curated structured data
|
||
- **Formats:**
|
||
- Instruction–response
|
||
- Schema-constrained generation
|
||
- **Content includes:**
|
||
- CVs and resumes
|
||
- Job descriptions
|
||
- Skills, roles, education, and experience fields
|
||
- **Data processing:**
|
||
- Deduplication
|
||
- Schema validation
|
||
- Removal of malformed samples
|
||
- Consistency and format checks
|
||
|
||
> No real personal data was intentionally included in the training datasets.
|