Search Results for author: Szymon Tworkowski

Found 7 papers, 3 papers with code

Analysing The Impact of Sequence Composition on Language Model Pre-Training

1 code implementation • 21 Feb 2024 • Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini

In this work, we find that applying causal masking can lead to the inclusion of distracting information from previous documents during pre-training, which negatively impacts the performance of the models on language modelling and downstream tasks.

In-Context Learning • Language Modelling +1


Structured Packing in LLM Training Improves Long Context Utilization

no code implementations • 28 Dec 2023 • Konrad Staniszewski, Szymon Tworkowski, Yu Zhao, Sebastian Jaszczur, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś

Recent developments in long-context large language models have attracted considerable attention.

Information Retrieval • Retrieval


Explaining Competitive-Level Programming Solutions using LLMs

no code implementations • 11 Jul 2023 • Jierui Li, Szymon Tworkowski, Yingying Wu, Raymond Mooney

In this paper, we approach competitive-level programming problem-solving as a composite task of reasoning and code generation.

Code Generation • Explanation Generation


Focused Transformer: Contrastive Training for Context Scaling

1 code implementation • NeurIPS 2023 • Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś

The Focused Transformer's contrastive training enhances the structure of the (key, value) space, enabling an extension of the context length.

Contrastive Learning


Magnushammer: A Transformer-Based Approach to Premise Selection

no code implementations • 8 Mar 2023 • Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak, Bartosz Piotrowski, Albert Qiaochu Jiang, Jin Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

By combining Magnushammer with a language-model-based automated theorem prover, we further improve the state-of-the-art proof success rate from $57.0\%$ to $71.0\%$ on the PISA benchmark using $4\times$ fewer parameters.

Automated Theorem Proving • Language Modelling +1


Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

no code implementations • 22 May 2022 • Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8.2\%$ of problems that neither language models nor automated theorem provers are able to solve on their own.

Ranked #3 on Automated Theorem Proving on miniF2F-test

Automated Theorem Proving


Hierarchical Transformers Are More Efficient Language Models

3 code implementations • Findings (NAACL) 2022 • Piotr Nawrot, Szymon Tworkowski, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski

Transformer models yield impressive results on many NLP and sequence modeling tasks.

Ranked #4 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation • Language Modelling

