- SUM: Saliency Unification through Mamba for Visual Attention Modeling
Alireza Hosseini, Amirhossein Kazerouni, Saeed Akhavan, Michael Brudno, Babak Taati
I received a B.Sc. degree in Electrical Engineering from Iran University of Science and Technology and completed my Master of Science in Telecommunication Systems at the University of Tehran in September 2024. I currently work as a research assistant at the Computation and Communication Lab at the University of Tehran and as an AI developer at AVIR AI Center. My fields of interest include Machine Learning, Deep Learning, Signal Processing, and related areas. It is always my pleasure to discuss topics related to research. Feel free to email me! [My CV]
Master's in Telecommunication Systems
University of Tehran
Sep. 2022 - Sep. 2024
Bachelor's in Electrical Engineering
Iran University of Science and Technology
Sep. 2017 - Mar. 2022
AI Developer, AVIR AI Center
Tehran, Iran
Jul. 2022 - Present
AI Developer, University of Tehran
Jan. 2023 - Nov. 2023
AI Developer, Irankhodro Powertrain Co
Tehran, Iran
Jul. 2021 - Jul. 2022
Alireza Hosseini, Amirhossein Kazerouni, Saeed Akhavan, Michael Brudno, Babak Taati
Alireza Hosseini, Kiana Hooshanfar, Pouria Omrani, Reza Toosi, Ramin Toosi, Zahra Ebrahimian, Mohammad Ali Akhaee
Amirhossein Kazerouni, Reza Azad, Alireza Hosseini, Dorit Merhof, Ulas Bagci
Pouria Omrani, Alireza Hosseini, Kiana Hooshanfar, Zahra Ebrahimian, Ramin Toosi, Mohammad Ali Akhaee
Conference IEEE ICWR 2024
Alireza Hosseini, Matine Hajyan, Ramin Toosi, Mohammad Ali Akhaee
Conference IEEE ICWR 2023
Ashkan Moosavian, Alireza Hosseini, Seyed Mohammad Jafari, Iman Chitsaz, Shahriar Baradaran Shokouhi
Journal Automotive Science and Engineering 2022
Alireza Hosseini, Ashkan Moosavian, Saeed Javan, Shahriar B. Shokouhi
The Journal of Engine Research 2022
Dec 2024 - Jan 2025
Adak Vira Iranian Rahjoo (Avir)
Designed and implemented a system for precise information extraction from documents by fine-tuning and quantizing the NVILA vision-language model (using LLM-AWQ). This solution converts user queries into JSON outputs with high accuracy.
Oct 2024 - Dec 2024
Adak Vira Iranian Rahjoo (Avir)
Developed an end-to-end solution for seamless database interaction using LLMs to convert natural language into SQL queries.
Aug 2024 - Dec 2024
Adak Vira Iranian Rahjoo (Avir)
Utilized diffusion models with prompt engineering to generate logos from company descriptions, styles, and colors.
Jan 2023 - Nov 2024
Adak Vira Iranian Rahjoo (Avir)
Developed AI engines for eye tracking, saliency prediction, and brand attention detection for video advertisement analysis.
Sep 2024 - Oct 2024
Adak Vira Iranian Rahjoo (Avir)
Modified xLSTM models to forecast data points and trends over the next 7 days.
Mar 2024 - Oct 2024
Adak Vira Iranian Rahjoo (Avir)
Automated text-to-speech conversion and audio-slide video synchronization, and integrated talking avatars.
Apr 2024 - Sep 2024
Adak Vira Iranian Rahjoo (Avir)
Developed retrieval-augmented generation (RAG) and fine-tuning pipelines for Persian-language assistants and chatbots.
Aug 2024
Adak Vira Iranian Rahjoo (Avir)
Extracted financial data from salary documents (PDFs/images) using LLMs for verification.
Jun 2024 - Jul 2024
Adak Vira Iranian Rahjoo (Avir)
Automated subtitle generation using a custom ASR model to transcribe video speech.
May 2024 - Jun 2024
Adak Vira Iranian Rahjoo (Avir)
Modified the SadTalker code (CVPR 2023) to generate facial landmarks from audio.
Apr 2024 - Jun 2024
Adak Vira Iranian Rahjoo (Avir)
Processed and structured text data from books for effective RAG model input.
Mar 2024 - May 2024
University of Tehran
Developed a Mamba encoder-based model for reconstructing masked signals in sequence data.
Jan 2024 - Feb 2024
Adak Vira Iranian Rahjoo (Avir)
Implemented a speaker diarization solution by modifying the pyannote.audio framework.
Nov 2023 - Feb 2024
Adak Vira Iranian Rahjoo (Avir)
Trained and validated Whisper ASR on a custom Persian dataset for enhanced speech recognition.
Sep 2023 - Oct 2023
Adak Vira Iranian Rahjoo (Avir)
Converted Persian PDF documents to Word using Tesseract OCR for accurate text extraction.
Dec 2022 - May 2023
Adak Vira Iranian Rahjoo (Avir)
Applied GANs to transform images and videos into cartoon-like visuals.
Feb 2023 - Mar 2023
Adak Vira Iranian Rahjoo (Avir)
Developed a method for distinguishing vowels using frequency response curves for accurate lip syncing.
Aug 2022 - Nov 2022
Adak Vira Iranian Rahjoo (Avir)
Analyzed social media data and brand campaigns to evaluate campaign effectiveness and impact.
Jul 2021 - Nov 2022
University of Tehran
Developed and optimized AI engines for OCR including preprocessing, postprocessing, and dataset management.
Supervisor: Dr. Mohammad Ali Akhaee
Oct 2022 - Nov 2022
Adak Vira Iranian Rahjoo (Avir)
Implemented an ASR module and used Levenshtein distance to compare spoken and written content in videos.
Oct 2020 - Jan 2021
Iran University of Science and Technology
Developed a system for recognizing car license plates using classic image processing techniques in MATLAB.
[Jan. 2025] Accepted paper, SUM, selected for oral presentation at WACV'25.
[Nov. 2024] Accepted paper, Multi-Task Framework for Hand Images using Mamba at IKT'15.
[Sep. 2024] Defended Master's thesis at University of Tehran.
[Aug. 2024] Accepted paper, SUM at WACV'25.
[Jul. 2024] arXiv preprint, SUM.
[May 2024] Accepted paper, Hybrid RAG Approach for LLMs at ICWR'24.
[Mar. 2024] arXiv preprint, Brand Visibility in Packaging.
When I’m not immersed in research or development, I dedicate my time to activities that inspire creativity, bring relaxation, and keep me energized.
Reading books opens up new perspectives and sparks creativity.
Listening to music and watching movies are great ways for me to relax and enjoy creative storytelling.
Playing the guitar and exploring other instruments helps me connect with my artistic side.
I enjoy exploring and capturing the beauty of nature through photography, mountain climbing, hiking, and other outdoor adventures.
Playing football and volleyball, swimming, watching sports, and working out regularly keep me energized.
I enjoy expressing my thoughts and creativity through writing, exploring literature, and crafting meaningful stories.
For inquiries or feedback on my research, don't hesitate to reach out. I’m always happy to hear from you and exchange ideas.