
---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- unsloth
- mistral
- fine-tuned
- education
- guidance
- bangladesh
- india
- university
- lora
- qlora
- conversational
- text-generation
- doi:10.57967/hf/7639
base_model: mistralai/Mistral-7B-Instruct-v0.3
datasets:
- millat/indian_university_guidance_for_bangladeshi_students
model-index:
- name: mistral-7b-indian-university-guidance
  results:
  - task:
      type: text-generation
      name: Text Generation
    metrics:
    - type: perplexity
      value: 1.4555
      name: Perplexity
    - type: loss
      value: 0.3754
      name: Evaluation Loss
pipeline_tag: text-generation
---

Mistral 7B - Indian University Guidance for Bangladeshi Students

A fine-tuned Mistral-7B model specialized for providing comprehensive guidance to Bangladeshi students seeking admission to Indian universities. The model delivers accurate, domain-specific information about scholarships, admissions, visa processes, and degree equivalencies.

This model was trained on the contextual dataset introduced in our research paper, "Development of a Contextual Educational Dataset for Bangladeshi Students Studying in India", presented at Sharda University, Greater Noida, India.

Model DOI: 10.57967/hf/7639
Dataset DOI: 10.57967/hf/6295

Authors

  • MD Millat Hosen
  • Md Moudud Ahmed Misil
  • Dr. Rohit Kumar Sachan

Model Highlights

  • Low Perplexity: 1.4555 on the held-out evaluation split (evaluation loss 0.3754)
  • Domain-Specialized: Trained on 7,044 curated Q&A pairs
  • Practical Knowledge: Covers Sharda University scholarships, degree equivalencies, GPA conversions, and regulatory requirements (AICTE, NMC, BMDC)
  • Efficient Training: QLoRA fine-tuning on Tesla T4 (Google Colab Free Tier)

Model Description

This model is fine-tuned from mistralai/Mistral-7B-Instruct-v0.3 using the Unsloth framework with QLoRA (Quantized Low-Rank Adaptation). It provides specialized guidance covering:

| Category | Topics Covered |
| --- | --- |
| Universities | Sharda University (primary focus), Galgotias, Amity, Noida International University (NIU) |
| Scholarships | Eligibility criteria, percentage waivers (20%/50%), retention requirements, program exclusions |
| Admissions | Requirements, documents, deadlines, lateral entry for diploma holders |
| Visa & Documentation | Student visa process, FRRO registration, required documents |
| Degree Equivalence | B.Sc. Engineering ↔ B.Tech, B.Com, BA, Polytechnic Diploma, Madrasa (Alim) recognition |
| GPA Conversion | HSC GPA (out of 5) to Indian percentage/CGPA (out of 10) system |
| Medical Education | MBBS requirements, NMC approval, BMDC recognition |
| Regulatory Bodies | AICTE, NMC, BMDC, UGC requirements and approval processes |

Note: This model is primarily trained on Sharda University data and general degree equivalence information. It does not contain specific information about IITs, NITs, or other public universities.

Training Details

Configuration

| Parameter | Value |
| --- | --- |
| Base Model | mistralai/Mistral-7B-Instruct-v0.3 |
| Framework | Unsloth + QLoRA |
| Quantization | 4-bit (bnb) |
| LoRA Rank (r) | 32 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0 |
| Max Sequence Length | 512 |
| Learning Rate | 2e-4 |
| Batch Size | 4 |
| Gradient Accumulation | 4 |
| Effective Batch Size | 16 |
| Epochs | 3 |
| Total Steps | 1,191 |
| Warmup Steps | 10 |
| Seed | 3407 |
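
The derived rows in this configuration (effective batch size and total steps) follow from the other values. As a quick check, the sketch below recomputes them using the 6,339 training samples listed in the Dataset section. It assumes that a partial final batch in each epoch still counts as one optimizer step; that is an assumption about the training loop, not something this card states.

```python
import math

per_device_batch_size = 4
gradient_accumulation_steps = 4
epochs = 3
training_samples = 6339  # 90% training split, from the Dataset section

# One optimizer step consumes batch_size * accumulation samples.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Assumption: the last, partially filled batch of an epoch still triggers a step.
steps_per_epoch = math.ceil(training_samples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(effective_batch_size, steps_per_epoch, total_steps)  # 16 397 1191
```

Under that assumption the result matches the 1,191 total steps reported above.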

Training Results

| Metric | Value |
| --- | --- |
| Final Training Loss | 0.1016 |
| Evaluation Loss | 0.3754 |
| Perplexity | 1.4555 |
| Reported Training Duration | 66.47 minutes* |
| Actual Wall-Clock Time | ~3.5 hours** |

*Timer reset on Colab reconnections; this is cumulative GPU compute time only
**Actual elapsed time from first epoch start (07:32) to final epoch completion (10:58) was approximately 3 hours 26 minutes due to Colab session interruptions and reconnections

Training Progress

| Epoch | Eval Loss | Perplexity |
| --- | --- | --- |
| 1 | 0.3820 | 1.4652 |
| 2 | 0.3757 | 1.4560 |
| 3 | 0.3754 | 1.4555 |
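
The perplexity values are consistent with the usual definition for causal language models: the exponential of the mean cross-entropy evaluation loss (in nats). A minimal check against the per-epoch figures, allowing a small tolerance because the reported losses are rounded to four decimals:

```python
import math

# (epoch, eval_loss, reported_perplexity) copied from the Training Progress figures
rows = [
    (1, 0.3820, 1.4652),
    (2, 0.3757, 1.4560),
    (3, 0.3754, 1.4555),
]

computed = []
for epoch, eval_loss, reported in rows:
    perplexity = math.exp(eval_loss)  # perplexity = e^(mean cross-entropy)
    computed.append((epoch, perplexity, reported))
    print(f"epoch {epoch}: exp({eval_loss}) = {perplexity:.4f} (reported {reported})")

# Every recomputed value should agree with the reported one to about 3 decimals.
max_gap = max(abs(p - r) for _, p, r in computed)
```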

Hardware

| Resource | Specification |
| --- | --- |
| GPU | Tesla T4 |
| GPU Memory (Total) | 15.83 GB |
| GPU Memory (Peak) | 11.95 GB |
| GPU Memory (Average) | 5.19 GB |
| RAM (Peak) | 5.95 GB |

Dataset

Trained on millat/indian_university_guidance_for_bangladeshi_students - a custom dataset containing 7,044 high-quality, instruction-formatted Q&A pairs created by the same authors using the SetForge pipeline.

| Property | Value |
| --- | --- |
| Dataset DOI | 10.57967/hf/6295 |
| Format | JSONL (JSON Lines) |
| License | MIT |
| Total Samples | 7,044 |
| Training Samples | 6,339 (90%) |
| Evaluation Samples | 705 (10%) |
| Estimated Tokens | ~1,019,372 |
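
The split sizes and the token estimate can be cross-checked with simple arithmetic. The sketch below assumes the evaluation split is 10% of the total rounded up, with the remainder used for training; the card does not say how the split was made, so treat that rounding rule as an assumption. The token check uses the average question and answer token counts reported under Data Statistics.

```python
import math

total_samples = 7044

# Assumption: evaluation size is 10% rounded up; training gets the rest.
eval_samples = math.ceil(total_samples * 0.10)
train_samples = total_samples - eval_samples
print(train_samples, eval_samples)  # 6339 705

# Rough token estimate from the reported per-record averages.
avg_question_tokens = 38.9
avg_answer_tokens = 105.8
estimated_tokens = total_samples * (avg_question_tokens + avg_answer_tokens)

reported_tokens = 1_019_372
relative_gap = abs(estimated_tokens - reported_tokens) / reported_tokens
print(f"estimated tokens: {estimated_tokens:,.0f} (gap {relative_gap:.4%})")
```

The small gap in the token estimate is expected, since the averages are rounded to one decimal.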

Dataset Topics

| Category | Topics Covered |
| --- | --- |
| University Information | Private NCR universities: Sharda (primary), Galgotias, Amity, Noida International University |
| Scholarships | Eligibility criteria, percentage waivers (20%/50%), retention requirements, program exclusions |
| Admissions | Requirements, documents, deadlines, lateral entry for diploma holders |
| Visa & Documentation | Student visa process, e-FRRO registration, passport requirements |
| Degree Equivalence | Bangladesh to India credential conversion (B.Sc. Engineering ↔ B.Tech, Polytechnic Diploma, Madrasa Alim) |
| GPA Conversion | HSC GPA (out of 5) to Indian percentage/CGPA (out of 10) system |
| Regulatory Bodies | AICTE, NMC, BMDC, UGC requirements and approval processes |

Data Sources (from research paper)

| Category | Sources |
| --- | --- |
| Universities | Sharda University (4 portals), Galgotias University (3 portals), Amity University (3 portals), Noida International University (2 portals) |
| Government - Bangladesh | High Commission New Delhi, Deputy High Commission Chennai, e-Passport Portal |
| Government - India | High Commission Dhaka, e-FRRO, Indian Visa portals, MEA, MHA, Study in India portal |

⚠️ Note: This dataset does NOT include IITs, NITs, Central Universities, or other public/government institutions. It focuses exclusively on private universities in India's National Capital Region (NCR).

Data Fields

Each record in the dataset contains:

| Field | Description |
| --- | --- |
| question | Question from the perspective of a Bangladeshi student |
| answer | Comprehensive, direct answer to the question |
| context | Brief description of the topic the Q&A belongs to |
| source | Source of information used to generate the answer |
| metadata | Nested object with degree_equivalence, grading_conversion, country_origin, tone, cultural_sensitivity |
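
To make the schema concrete, here is a minimal sketch that parses one JSONL line and checks it against the fields listed above. The `validate_record` helper is hypothetical, and the sample record uses invented placeholder values, not data from the dataset. The value types inside `metadata` are not documented here, so the check only looks at key presence.

```python
import json

REQUIRED_FIELDS = {"question", "answer", "context", "source", "metadata"}
METADATA_KEYS = {
    "degree_equivalence",
    "grading_conversion",
    "country_origin",
    "tone",
    "cultural_sensitivity",
}


def validate_record(line: str) -> dict:
    """Parse one JSONL line and verify that the documented fields are present."""
    record = json.loads(line)
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    missing_meta = METADATA_KEYS - record["metadata"].keys()
    if missing_meta:
        raise ValueError(f"missing metadata keys: {sorted(missing_meta)}")
    return record


# Placeholder record for illustration only (hypothetical values).
sample_line = json.dumps({
    "question": "<question text>",
    "answer": "<answer text>",
    "context": "<topic description>",
    "source": "<source description>",
    "metadata": {key: "<placeholder>" for key in METADATA_KEYS},
})

record = validate_record(sample_line)
print(sorted(record.keys()))
```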

Data Statistics

| Metric | Questions | Answers |
| --- | --- | --- |
| Min Characters | 18 | 65 |
| Max Characters | 307 | 1,151 |
| Average Characters | 128.5 | 391.7 |
| Average Tokens | 38.9 | 105.8 |

Data Curation Pipeline

The dataset was created through the SetForge pipeline (as described in our research paper):

  1. Source Selection: Identification of relevant sources from NCR private universities and government portals
  2. Data Extraction: Content collected using WebScrape Chrome extension with support for dynamic content and PDF extraction via pdf.js
  3. Data Preprocessing: NLP and RegEx-based cleaning, filtering, duplicate removal, and domain categorization
  4. Data Categorization & Standardization: AI-powered document triage using LLMs for semantic chunking and schema-guided structuring with resilient error handling
  5. Dataset Generation: Context-aware Q&A pair generation with instruction-based formatting using asyncio for concurrent processing, stored in JSONL format
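
Step 5 can be pictured with the following sketch of concurrent generation with asyncio and JSONL output. It is not the SetForge implementation: `generate_qa_pair` is a hypothetical stand-in for the LLM call, and the concurrency limit, chunk texts, and output path are arbitrary choices made for the example.

```python
import asyncio
import json
import tempfile
from pathlib import Path


async def generate_qa_pair(chunk: str) -> dict:
    """Hypothetical stand-in for an LLM call that turns a text chunk into a Q&A pair."""
    await asyncio.sleep(0)  # yield control, as a real network request would
    return {
        "question": f"<question about: {chunk}>",
        "answer": "<answer>",
        "context": chunk,
    }


async def build_dataset(chunks: list[str], out_path: Path, max_concurrency: int = 4) -> int:
    """Generate one record per chunk concurrently and write them as JSON Lines."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(chunk: str) -> dict:
        async with semaphore:  # cap the number of in-flight requests
            return await generate_qa_pair(chunk)

    # gather() returns results in the same order as the inputs.
    records = await asyncio.gather(*(worker(chunk) for chunk in chunks))

    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")  # one object per line
    return len(records)


chunks = ["scholarship policy", "visa checklist", "lateral entry rules"]
out_path = Path(tempfile.gettempdir()) / "qa_pairs_example.jsonl"
written = asyncio.run(build_dataset(chunks, out_path))
print(f"wrote {written} records to {out_path}")
```

The semaphore is the main design choice here: it keeps the number of simultaneous requests bounded, which matters when the real call is a rate-limited API.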

Usage

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="millat/mistral-7b-indian-university-guidance",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

prompt = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
What scholarships are available at Sharda University for Bangladeshi students?

### Response:
"""

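inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
# do_sample=True is needed for temperature to take effect; without it, generate() decodes greedily
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)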
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "millat/mistral-7b-indian-university-guidance"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Can I get lateral entry to B.Tech 2nd year with my Polytechnic Diploma from Bangladesh?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
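
Both snippets above repeat the same Alpaca-style template and print the prompt together with the completion. A small helper can build the prompt and strip the echoed part from the decoded text. This is a convenience sketch: `build_prompt` and `extract_response` are hypothetical names, not part of Unsloth or Transformers.

```python
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Wrap a user question in the instruction template shown in the usage examples."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(decoded: str) -> str:
    """Return only the text after the first response marker, or the whole text if absent."""
    if RESPONSE_MARKER not in decoded:
        return decoded.strip()
    return decoded.split(RESPONSE_MARKER, 1)[1].strip()


prompt = build_prompt("What is the FRRO registration process?")
# Simulate a decoded generation: the prompt followed by the model's continuation.
decoded = prompt + "<model answer>"
print(extract_response(decoded))  # <model answer>
```

With either loading path, pass `build_prompt(question)` to the tokenizer and call `extract_response` on the decoded output.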

With GGUF (llama.cpp)

# Download the GGUF files (narrow the pattern to "*q8_0.gguf" to fetch only the Q8_0 file)
huggingface-cli download millat/mistral-7b-indian-university-guidance \
    --include "*.gguf" \
    --local-dir ./models

# Run with llama.cpp (recent builds name this binary llama-cli instead of main)
./main -m ./models/mistral-7b-indian-university-guidance-q8_0.gguf \
    -p "What are the requirements for MBBS admission in India for Bangladeshi students?" \
    -n 256

With Ollama

# Create Modelfile (run this in the directory containing the GGUF file, since FROM uses a relative path)
cat << 'EOF' > Modelfile
FROM ./mistral-7b-indian-university-guidance-q8_0.gguf
TEMPLATE """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{{ .Prompt }}

### Response:
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

# Create and run
ollama create indian-uni-guide -f Modelfile
ollama run indian-uni-guide "What is the FRRO registration process?"

Example Outputs

Scholarship Information

Question: "I have a GPA of 3.8 out of 5.0 in my HSC from Bangladesh. What scholarship can I get at Sharda University?"

Response: "With a GPA of 3.8/5, you are eligible for a 50% tuition fee scholarship at Sharda University for programs like B.Tech, BBA, MBA, BCA, and others listed in the 50% scholarship category. This is because your GPA falls within the 3.5-5.0 range. Exclusions from scholarships include Pharmacy, M.Sc. Nursing, MPT, Medical M.Sc., BDS & MBBS. To retain this scholarship after first year, maintain 75% attendance and pass all exams without backlogs."

Degree Equivalency

Question: "I have a B.Sc. in Engineering from Bangladesh. Is it equivalent to B.Tech in India?"

Response: "Yes, your 4-year B.Sc. in engineering (Bangladeshi university curriculum) is considered equivalent to an Indian B.Tech or B.E. degree. The key requirement for recognition in Bangladesh is that the Indian engineering program must be AICTE-approved."

Medical Degree Recognition

Question: "Will my MBBS degree from India be recognized by BMDC in Bangladesh?"

Response: "For your MBBS or BDS degree from an Indian university to be recognized and valid for practice in Bangladesh, it must be NMC-approved (National Medical Commission) in India and subsequently recognized by the BMDC (Bangladesh Medical and Dental Council). Ensure that the program is full-time and on campus. Recognition typically requires completion of specific regulatory requirements by both parties."

Lateral Entry

Question: "Can I get lateral entry to B.Tech 2nd year with my Polytechnic Diploma from Bangladesh?"

Response: "Yes, your 4-year Polytechnic Diploma from Bangladesh (completed after SSC) makes you eligible for lateral entry into the second year of B.Tech programs at Indian universities like Sharda University. The Indian B.Tech degree is equivalent to B.Sc. in Engineering in Bangladesh. Ensure the Indian institution is AICTE-approved for your degree to be recognized back in Bangladesh."

FRRO Registration

Question: "What documents do I need for FRRO registration in India as a Bangladeshi student?"

Response: "You will need: original passport with visa, bonafide certificate from university/college, CPR certificate (compulsory pregnancy report if applicable), marriage certificate (if married), medical certificate (AIDS test result) and 4 photographs."

Model Comparison

In a qualitative comparison, the fine-tuned model improves on the base model in the following respects:

| Aspect | Base Model | Fine-tuned Model |
| --- | --- | --- |
| Domain Knowledge | Generic | Specialized for BD→India education |
| Response Length | Verbose, often off-topic | Concise and focused |
| Accuracy | Generic advice | Specific criteria, percentages, requirements |
| Hallucinations | Common (fake URLs, HTML artifacts) | Minimal |
| Regulatory Knowledge | Limited | Accurate (AICTE, NMC, BMDC) |

Available Formats

| Format | Size | Use Case |
| --- | --- | --- |
| Safetensors (16-bit) | ~14.5 GB | Full precision inference |
| GGUF Q8_0 | ~7.7 GB | High-quality local inference |
| GGUF Q4_K_M | ~4.37 GB | Balanced quality/size |
| GGUF Q4_0 | ~4.11 GB | Smaller footprint |
| GGUF Q3_K_M | ~3.52 GB | Memory-constrained environments |
| GGUF Q2_K | ~2.72 GB | Minimum size |
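
Choosing among these files is mostly a matter of the available budget. As a rough sketch, the hypothetical `pick_format` helper below returns the largest listed file that fits a given size budget in GB. The sizes are the approximate file sizes listed above; runtime memory use will be higher than the file size because of the KV cache and runtime overhead, so leave headroom when setting the budget.

```python
from typing import Optional

# (format name, approximate file size in GB) from the Available Formats list
FORMATS = [
    ("Safetensors (16-bit)", 14.5),
    ("GGUF Q8_0", 7.7),
    ("GGUF Q4_K_M", 4.37),
    ("GGUF Q4_0", 4.11),
    ("GGUF Q3_K_M", 3.52),
    ("GGUF Q2_K", 2.72),
]


def pick_format(budget_gb: float) -> Optional[str]:
    """Return the largest format whose file size fits the budget, or None if none fit."""
    candidates = [(name, size) for name, size in FORMATS if size <= budget_gb]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])[0]


print(pick_format(8.0))  # GGUF Q8_0
print(pick_format(4.2))  # GGUF Q4_0
print(pick_format(1.0))  # None
```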

Limitations

  • Geographic Scope: Focused on Bangladeshi students applying to Indian universities
  • Temporal: Information reflects 2025 admission cycles; verify current requirements
  • University Coverage:
    • Primary: Sharda University (most data - scholarships, programs, eligibility)
    • Secondary: Galgotias, Amity, Noida International University (NIU)
    • Not Covered: IITs (Indian Institutes of Technology), NITs (National Institutes of Technology), Central Universities, and other public/government universities
  • Topic Bias: Heavy emphasis on scholarship eligibility criteria; limited information on campus life, placements, or research opportunities
  • Not Legal Advice: For official processes, always verify with respective authorities

Ethical Considerations

  • This model provides educational guidance and should not replace official university or government sources
  • Users should verify scholarship amounts, eligibility criteria, and visa requirements with official sources
  • The model may occasionally generate plausible but outdated information

Citation

If you use this model or dataset, please cite the research paper:

Paper Citation

@inproceedings{hosen2025contextual,
  author = {MD Millat Hosen and Md Moudud Ahmed Misil and Dr. Rohit Kumar Sachan},
  title = {Development of a Contextual Educational Dataset for Bangladeshi Students Studying in India},
  booktitle = {Proceedings of Sharda University},
  year = {2025},
  address = {Greater Noida, India},
  institution = {School of Computer Science and Engineering (SSCSE), Sharda University},
  keywords = {Bangladeshi, Large language model, Natural language processing, ChatBot, JSON}
}

Model Citation

@misc{hosen2025mistral_indian_uni,
  author = {MD Millat Hosen and Md Moudud Ahmed Misil and Dr. Rohit Kumar Sachan},
  title = {Mistral 7B - Indian University Guidance for Bangladeshi Students},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/millat/mistral-7b-indian-university-guidance},
  doi = {10.57967/hf/7639},
  note = {Fine-tuned with Unsloth + QLoRA on 7,044 domain-specific Q&A pairs}
}

Dataset Citation

@misc{hosen2025indian_university_dataset,
  author = {MD Millat Hosen and Md Moudud Ahmed Misil and Dr. Rohit Kumar Sachan},
  title = {indian_university_guidance_for_bangladeshi_students},
  year = {2025},
  url = {https://huggingface.co/datasets/millat/indian_university_guidance_for_bangladeshi_students},
  doi = {10.57967/hf/6295},
  publisher = {Hugging Face}
}

Acknowledgments

License

Apache 2.0 - See LICENSE for details.


Model Card Last Updated: January 25, 2026
