AVS speaker proposal

Interactive Ranking Aggregation of multiple results

The Ad-hoc Video Search (AVS) task requires participants to model a user’s text query and retrieve the video shots that match the textual description. After reviewing the approaches of last year’s teams, we chose to use embedding models and adopted some of the practices that had worked well for them. For automatic runs, we applied the classic language-image pre-training model CLIP and several of its variants: SLIP, BLIP, BLIP-2, and LaCLIP. We also used a diffusion model to turn each text query into a large set of generated images, from which we obtained a so-called “mean image query”. For the runs that use feedback, we used Top-K Feedback and a new algorithm, Quantum-Theoretic Interactive Ranking Aggregation (QT-IRA), which adjusts the models’ weights based on relevance feedback.
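As a rough illustration of the “mean image query” idea, the sketch below assumes the query vector is simply the average of the L2-normalized embeddings of the generated images, and that shots are ranked by cosine similarity to it. The function names, the toy two-dimensional vectors, and the averaging rule itself are assumptions made for illustration; the embeddings would in practice come from an image encoder such as CLIP, which is not shown here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def mean_image_query(image_embeddings):
    """Average the normalized embeddings of generated images into one query.

    Assumption: the 'mean image query' is the normalized mean embedding.
    """
    if not image_embeddings:
        raise ValueError("need at least one generated-image embedding")
    normalized = [l2_normalize(e) for e in image_embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def rank_shots(query, shot_embeddings):
    """Return shot ids sorted by cosine similarity to the query, best first."""
    scores = {
        shot_id: sum(q * s for q, s in zip(query, l2_normalize(emb)))
        for shot_id, emb in shot_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy usage: two hypothetical generated-image embeddings, three video shots.
generated = [[1.0, 0.0], [0.8, 0.6]]
shots = {"a": [1.0, 0.3], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
query = mean_image_query(generated)
print(rank_shots(query, shots))
```

Averaging after normalization keeps any single generated image with a large embedding norm from dominating the query; other pooling choices are equally plausible.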

Across the various language-image models, we observed some interesting phenomena. On the one hand, the more diverse the types of models, the better the results after fusion. On the other hand, among models of the same kind, the fewer poorly performing models are included, the better the results. In addition, inspired by team Waseda_Meisei_SoftBank in 2022, we used a diffusion model to turn the text query into a “mean image query”. In our experiments, this method performed even better than CLIP on certain queries. This suggests that comparing within the same modality, rather than within the same mapping space, can perform well. We also made attempts in several other directions. For example, we tried using ChatGPT to design an appropriate prompt for formulating the initial query. Even after integrating the Chain of Thought (CoT) approach, the improvement in results remained unstable.
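To make the fusion step concrete, here is a minimal sketch of weighted late fusion. It assumes each model produces a per-shot similarity score, that scores are min-max normalized per model before being combined, and that the fused score is a weighted sum. The normalization choice, the function names, and the example scores are assumptions for illustration, not the exact procedure used in our runs.

```python
def min_max_normalize(scores):
    """Rescale one model's scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {shot: 0.0 for shot in scores}
    return {shot: (s - lo) / (hi - lo) for shot, s in scores.items()}


def fuse_scores(model_scores, weights=None):
    """Combine per-model scores into one ranking with a weighted sum.

    model_scores: {model_name: {shot_id: score}}
    weights: {model_name: weight}; equal weights if omitted.
    """
    if weights is None:
        weights = {name: 1.0 for name in model_scores}
    total = sum(weights[name] for name in model_scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = {}
    for name, scores in model_scores.items():
        w = weights[name] / total
        for shot, s in min_max_normalize(scores).items():
            fused[shot] = fused.get(shot, 0.0) + w * s
    return sorted(fused, key=fused.get, reverse=True)


# Toy usage with made-up scores from two models.
model_scores = {
    "clip": {"s1": 0.9, "s2": 0.5, "s3": 0.1},
    "blip": {"s1": 0.2, "s2": 0.8, "s3": 0.1},
}
print(fuse_scores(model_scores))
print(fuse_scores(model_scores, {"clip": 3.0, "blip": 1.0}))
```

With equal weights the two models disagree on the top shot and the fused ranking reflects both; shifting weight toward one model moves the ranking toward that model's preference, which is why the choice of weights matters.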

With so many models, our primary task is to determine the optimal fusion weights for them. One natural idea is to adjust the weights using feedback information, which is why we propose a new algorithm, Quantum-Theoretic Interactive Ranking Aggregation (QT-IRA). However, the improvement in results is limited. The quality of the interaction needs further improvement, for instance by providing feedback on specific details rather than general information. For example, given the query “a man wears black shorts” and a negative feedback image showing “a woman wears black shorts”, we would like the model to recognize that “woman” is the wrong part, not “black shorts”.
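The general idea of adjusting fusion weights from relevance feedback can be sketched as follows. This is not QT-IRA; it is a generic multiplicative-weights update swapped in purely for illustration. It assumes per-model scores are already normalized to [0, 1], and it raises the weight of models that score the positively judged shots above the negatively judged ones. The function name, the learning rate `eta`, and the example data are all hypothetical.

```python
import math


def update_weights(weights, model_scores, positives, negatives, eta=1.0):
    """Reweight models by how well they separate positive from negative shots.

    Generic multiplicative-weights update (an illustrative stand-in, not QT-IRA).
    """
    def mean_score(scores, shots):
        vals = [scores[s] for s in shots if s in scores]
        return sum(vals) / len(vals) if vals else 0.0

    updated = {}
    for name, w in weights.items():
        scores = model_scores[name]
        # Margin > 0 means the model agrees with the user's feedback.
        margin = mean_score(scores, positives) - mean_score(scores, negatives)
        updated[name] = w * math.exp(eta * margin)
    total = sum(updated.values())
    return {name: v / total for name, v in updated.items()}


# Toy usage: the user marks s1 as relevant and s2 as not relevant.
weights = {"clip": 0.5, "blip": 0.5}
model_scores = {
    "clip": {"s1": 0.9, "s2": 0.1},
    "blip": {"s1": 0.2, "s2": 0.8},
}
print(update_weights(weights, model_scores, positives=["s1"], negatives=["s2"]))
```

Note that an update of this kind only uses shot-level judgments, so it cannot tell which part of a query a negative example violates; that is exactly the finer-grained feedback discussed above.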
