• NeurIPS 2024 Wrapped 🌯

  • Dec 30 2024
  • Length: 1 hr and 27 mins
  • Podcast

  • Summary

  • What happens when you bring over 15,000 machine learning nerds to one city? If your guess didn't include racism, sabotage and scandal, belated epiphanies, a spicy SoLaR panel, and many fantastic research papers, you wouldn't have captured my experience. In this episode we discuss the drama and takeaways from NeurIPS 2024.

    Posters available at time of episode preparation can be found on the episode webpage.

    EPISODE RECORDED 2024.12.08

    • (00:00) - Recording date
    • (00:05) - Intro
    • (00:44) - Obligatory mentions
    • (01:54) - SoLaR panel
    • (18:43) - Test of Time
    • (24:17) - And now: science!
    • (28:53) - Downsides of benchmarks
    • (41:39) - Improving the science of ML
    • (53:07) - Performativity
    • (57:33) - NopenAI and Nanthropic
    • (01:09:35) - Fun/interesting papers
    • (01:13:12) - Initial takes on o3
    • (01:18:12) - WorkArena
    • (01:25:00) - Outro

    Links

    Note: many workshop papers had not yet been published to arXiv as of preparing this episode; the OpenReview submission page is provided in these cases.

    • NeurIPS statement on inclusivity
    • CTOL Digital Solutions article - NeurIPS 2024 Sparks Controversy: MIT Professor's Remarks Ignite "Racism" Backlash Amid Chinese Researchers’ Triumphs
    • (1/2) NeurIPS Best Paper - Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
    • Visual Autoregressive Model report (this link now provides a 404 error)
      • Don't worry, here it is on archive.is
    • Reuters article - ByteDance seeks $1.1 mln damages from intern in AI breach case, report says
    • CTOL Digital Solutions article - NeurIPS Award Winner Entangled in ByteDance's AI Sabotage Accusations: The Two Tales of an AI Genius
    • Reddit post on Ilya's talk
    • SoLaR workshop page

    Referenced Sources

    • Harvard Data Science Review article - Data Science at the Singularity
    • Paper - Reward Reports for Reinforcement Learning
    • Paper - It's Not What Machines Can Learn, It's What We Cannot Teach
    • Paper - NeurIPS Reproducibility Program
    • Paper - A Metric Learning Reality Check

    Improving Datasets, Benchmarks, and Measurements

    • Tutorial video + slides - Experimental Design and Analysis for AI Researchers (I think you need to have attended NeurIPS to access the recording, but I couldn't find a different version)
    • Paper - BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices
    • Paper - Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
    • Paper - A Systematic Review of NeurIPS Dataset Management Practices
    • Paper - The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track
    • Paper - Benchmark Repositories for Better Benchmarking
    • Paper - Croissant: A Metadata Format for ML-Ready Datasets
    • Paper - Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox
    • Paper - Evaluating Generative AI Systems is a Social Science Measurement Challenge
    • Paper - Report Cards: Qualitative Evaluation of LLMs

    Governance Related

    • Paper - Towards Data Governance of Frontier AI Models
    • Paper - Ways Forward for Global AI Benefit Sharing
    • Paper - How do we warn downstream model providers of upstream risks?
    • Unified Model Records tool
    • Paper - Policy Dreamer: Diverse Public Policy Creation via Elicitation and Simulation of Human Preferences
    • Paper - Monitoring Human Dependence on AI Systems with Reliance Drills
    • Paper - On the Ethical Considerations of Generative Agents
    • Paper - GPAI Evaluation Standards Taskforce: Towards Effective AI Governance
    • Paper - Levels of Autonomy: Liability in the age of AI Agents

    Certified Bangers + Useful Tools

    • Paper - Model Collapse Demystified: The Case of Regression
    • Paper - Preference Learning Algorithms Do Not Learn Preference Rankings
    • LLM Dataset Inference paper + repo
    • dattri paper + repo
    • DeTikZify paper + repo

    Fun Benchmarks/Datasets

    • Paloma paper + dataset
    • RedPajama paper + dataset
    • Assemblage webpage
    • WikiDBs webpage
    • WhodunitBench repo
    • ApeBench paper + repo
    • WorkArena++ paper

    Other Sources

    • Paper - The Mirage of Artificial Intelligence Terms of Use Restrictions