81 lines
1.9 KiB
Markdown
81 lines
1.9 KiB
Markdown
|
|
---
|
|||
|
|
language:
|
|||
|
|
- en
|
|||
|
|
base_model:
|
|||
|
|
- microsoft/phi-4
|
|||
|
|
pipeline_tag: text-generation
|
|||
|
|
library_name: transformers
|
|||
|
|
tags:
|
|||
|
|
- phi
|
|||
|
|
- fine-tuned
|
|||
|
|
- full-finetune
|
|||
|
|
- instruction-tuning
|
|||
|
|
- text-generation
|
|||
|
|
- recruitment
|
|||
|
|
- resume-parsing
|
|||
|
|
- job-description-generation
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
# IMCatalina-v1.0
|
|||
|
|
|
|||
|
|
## Model summary
|
|||
|
|
**IMCatalina-v1.0** is a **fully fine-tuned** version of **Phi-4** specialized in **recruitment document processing**.
|
|||
|
|
|
|||
|
|
The model focuses exclusively on:
|
|||
|
|
- Parsing unstructured CVs/resumes
|
|||
|
|
- Converting CV content into structured formats (JSON / YAML)
|
|||
|
|
- Generating professional job descriptions from structured inputs
|
|||
|
|
|
|||
|
|
This model was trained end-to-end (full fine-tuning) and **does not perform candidate scoring, ranking, or hiring decisions**.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Intended use
|
|||
|
|
|
|||
|
|
### Primary use cases
|
|||
|
|
- CV and resume parsing
|
|||
|
|
- Structured CV normalization (JSON / YAML)
|
|||
|
|
- Extraction of skills, roles, education, and experience
|
|||
|
|
- Job description generation for recruitment platforms
|
|||
|
|
- Preprocessing for ATS and HR systems
|
|||
|
|
|
|||
|
|
### Explicitly out-of-scope
|
|||
|
|
- Candidate ranking or scoring
|
|||
|
|
- Hiring recommendations
|
|||
|
|
- Candidate–job matching
|
|||
|
|
- Automated decision-making
|
|||
|
|
- Psychological or behavioral inference
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Model details
|
|||
|
|
- **Base model:** microsoft/phi-4
|
|||
|
|
- **Model type:** Decoder-only causal language model
|
|||
|
|
- **Architecture:** Transformer (Phi family)
|
|||
|
|
- **Parameters:** ~14B
|
|||
|
|
- **Context length:** up to 16k tokens
|
|||
|
|
- **Languages:** English
|
|||
|
|
- **Training type:** Full fine-tuning
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Training
|
|||
|
|
|
|||
|
|
### Training data
|
|||
|
|
- **Domain:** Recruitment and HR documentation
|
|||
|
|
- **Data type:** Synthetic and curated structured data
|
|||
|
|
- **Formats:**
|
|||
|
|
- Instruction–response
|
|||
|
|
- Schema-constrained generation
|
|||
|
|
- **Content includes:**
|
|||
|
|
- CVs and resumes
|
|||
|
|
- Job descriptions
|
|||
|
|
- Skills, roles, education, and experience fields
|
|||
|
|
- **Data processing:**
|
|||
|
|
- Deduplication
|
|||
|
|
- Schema validation
|
|||
|
|
- Removal of malformed samples
|
|||
|
|
- Consistency and format checks
|
|||
|
|
|
|||
|
|
> No real personal data was intentionally included in the training datasets.
|