From 6cc1d66cb4b8574a3099bc69799290cc96fcf264 Mon Sep 17 00:00:00 2001 From: Joaquin Dominguez <83036592+j-dominguez9@users.noreply.github.com> Date: Fri, 28 Jun 2024 16:32:30 -0400 Subject: [PATCH] J dominguez9/contributing guide (#29) * add contributing file * Create feature_request.md * Create PULL_REQUEST_TEMPLATE.md * changed PRtemp location * Create bug_report.md * Update to feature_request.md --------- Co-authored-by: Nedelina T --- .env.sample | 2 +- .github/ISSUE_TEMPLATE/bug_report.md | 17 ++ .github/ISSUE_TEMPLATE/feature_request.md | 25 +++ .github/PULL_REQUEST_TEMPLATE.md | 20 ++ CONTRIBUTING.md | 182 ++++++++++++++++++ README.md | 38 ++-- .../sample-texts/data_points_samples.json | 30 ++- 7 files changed, 292 insertions(+), 22 deletions(-) create mode 100644 .github/ISSUE_TEMPLATE/bug_report.md create mode 100644 .github/ISSUE_TEMPLATE/feature_request.md create mode 100644 .github/PULL_REQUEST_TEMPLATE.md create mode 100644 CONTRIBUTING.md diff --git a/.env.sample b/.env.sample index 52ff640..9c35d44 100644 --- a/.env.sample +++ b/.env.sample @@ -1 +1 @@ -OPENAI_API_KEY="sk-xxxxx" # replace "sk-xxxxx" with your secret OpenAI API key +OPENAI_API_KEY="sk-xxxxx" # replace "sk-xxxxx" with your secret OpenAI API key diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md new file mode 100644 index 0000000..6331b68 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -0,0 +1,17 @@ +--- +name: Bug report +about: Create a bug report +title: '' +labels: bug +assignees: '' + +--- + +* Clearly describe the bug: + +* Instructions for reproducing the bug: + +* Expected behavior: + +* Proposed solution: + diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md new file mode 100644 index 0000000..5778a64 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.md @@ -0,0 +1,25 @@ +--- +name: feature request +about: suggested feature +title: '' +labels: feature request +assignees: '' +--- + +**What is the issue about?** +- [ ] improving/extending existing features; +- [ ] issue requires additional research to be implemented + +#### For improving/extending existing features, please include: + - Description of the proposed feature. + - Include a short motivation of why the feature will be useful to users. + +**For feature proposals that require additional research, include:** +* Proposal and background. +* Motivation for why the feature is interesting from a research perspective and why it will be useful to users. +* Short outline of experiments, and description of necessary sub-components (e.g. data curation, training, evaluation), if needed. +* Related literature or list of references that may be useful in evaluating the research proposal. + +We will use GitHub issues to discuss the proposal. Based on our discussions, the contributor(s) will proceed to implement the feature, and we will provide guidance/help as needed. +Once we receive the PRs, we will review them and merge them after performing the necessary verification and fixing any bugs. + diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 0000000..68d8bce --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,20 @@ +#### Context +What is the purpose of this PR? Is it to +- [ ] add a new feature +- [ ] fix a bug +- [ ] update tests and/or documentation +- [ ] other (please add here) + +Please link to any issues this PR addresses. + +#### Changelog +What are the changes made in this PR?
+ +#### Test plan +Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these just ask and we will happily help.) + +- [ ] run pre-commit hooks and linters (make sure you've first installed via `pre-commit install`) +- [ ] add unit tests for any new functionality +- [ ] update docstrings for any new or updated methods or classes +- [ ] run unit tests via `pytest tests` +- [ ] include relevant commands and any other artifacts in this summary (eval results, etc.) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..9231f18 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,182 @@ + +# Contributing to translation-agent + +First off, thanks for taking the time to contribute! + +All types of contributions are encouraged and valued. See the [Table of Contents](#table-of-contents) for different ways to help and details about how this project handles them. Please make sure to read the relevant section before making your contribution. It will make it a lot easier for us maintainers and smooth out the experience for all involved. The community looks forward to your contributions. + +> And if you like the project, but just don't have time to contribute, that's fine. There are other easy ways to support the project and show your appreciation, which we would also be very happy about: +> - Star the project +> - Tweet about it +> - Refer this project in your project's readme +> - Mention the project at local meetups and tell your friends/colleagues + + +## Table of Contents + +- [I Have a Question](#i-have-a-question) +- [I Want To Contribute](#i-want-to-contribute) + - [Reporting Bugs](#reporting-bugs) + - [Suggesting Enhancements](#suggesting-enhancements) + - [Your First Code Contribution](#your-first-code-contribution) + - [Improving The Documentation](#improving-the-documentation) +- [Styleguides](#styleguides) + - [Commit Messages](#commit-messages) + + + + +## I Have a Question + +> If you want to ask a question, we assume that you have read the available [Documentation](https://github.com/andrewyng/translation-agent/blob/main/README.md). + +Before you ask a question, it is best to search for existing [Issues](https://github.com/andrewyng/translation-agent/issues) that might help you. In case you have found a suitable issue and still need clarification, you can write your question in this issue. It is also advisable to search the internet for answers first. + +If you then still feel the need to ask a question and need clarification, we recommend the following: + +- Open an [Issue](https://github.com/andrewyng/translation-agent/issues/new). +- Provide as much context as you can about what you're running into. +- Provide project and platform versions (python, OS, etc.), depending on what seems relevant. + +We (or someone in the community) will then take care of the issue as soon as possible. + + + +## I Want To Contribute + +> ### Legal Notice +> When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content and that the content you contribute may be provided under the project license. + +### Reporting Bugs + + +#### Before Submitting a Bug Report + +A good bug report shouldn't leave others needing to chase you up for more information. Therefore, we ask you to investigate carefully, collect information and describe the issue in detail in your report. 
Please complete the following steps in advance to help us fix any potential bug as fast as possible. + +- Make sure that you are using the latest version. +- Determine if your bug is really a bug and not an error on your side, e.g. using incompatible environment components/versions (Make sure that you have read the [documentation](https://github.com/andrewyng/translation-agent/blob/main/README.md). If you are looking for support, you might want to check [this section](#i-have-a-question)). +- To see if other users have experienced (and potentially already solved) the same issue you are having, check whether a bug report already exists for your bug or error in the [bug tracker](https://github.com/andrewyng/translation-agent/issues?q=label%3Abug). +- Also make sure to search the internet (including Stack Overflow) to see if users outside of the GitHub community have discussed the issue. +- Collect information about the bug: + - Stack trace (Traceback) + - OS, Platform and Version (Windows, Linux, macOS, x86, ARM) + - Version of the interpreter, compiler, SDK, runtime environment, package manager, depending on what seems relevant. + - Possibly your input and the output + - Can you reliably reproduce the issue? And can you also reproduce it with older versions? + + +#### How Do I Submit a Good Bug Report? + +> You must never report security-related issues, vulnerabilities or bugs including sensitive information to the issue tracker, or elsewhere in public. Instead, sensitive bugs must be sent by email to . + + +We use GitHub issues to track bugs and errors. If you run into an issue with the project: + +- Open an [Issue](https://github.com/andrewyng/translation-agent/issues/new). (Since we can't be sure at this point whether it is a bug or not, we ask you not to talk about a bug yet and not to label the issue.) +- Explain the behavior you would expect and the actual behavior. +- Please provide as much context as possible and describe the *reproduction steps* that someone else can follow to recreate the issue on their own. This usually includes your code. For good bug reports you should isolate the problem and create a reduced test case. +- Provide the information you collected in the previous section. + +Once it's filed: + +- The project team will label the issue accordingly. +- A team member will try to reproduce the issue with your provided steps. If there are no reproduction steps or no obvious way to reproduce the issue, the team will ask you for those steps and mark the issue as `needs-repro`. Bugs with the `needs-repro` tag will not be addressed until they are reproduced. +- If the team is able to reproduce the issue, it will be marked `needs-fix`, as well as possibly other tags (such as `critical`), and the issue will be left to be [implemented by someone](#your-first-code-contribution). + +Please use the issue templates provided. + + + + +### Suggesting Enhancements + +This section guides you through submitting an enhancement suggestion for translation-agent, **including completely new features and minor improvements to existing functionality**. Following these guidelines will help maintainers and the community understand your suggestion and find related suggestions. + + +#### Before Submitting an Enhancement + +- Make sure that you are using the latest version. +- Read the [documentation](https://github.com/andrewyng/translation-agent/blob/main/README.md) carefully and find out if the functionality is already covered, maybe by an individual configuration.
+- Perform a [search](https://github.com/andrewyng/translation-agent/issues) to see if the enhancement has already been suggested. If it has, add a comment to the existing issue instead of opening a new one. +- Find out whether your idea fits with the scope and aims of the project. It's up to you to make a strong case to convince the project's developers of the merits of this feature. Keep in mind that we want features that will be useful to the majority of our users and not just a small subset. If you're just targeting a minority of users, consider writing an add-on/plugin library. + + +#### How Do I Submit a Good Enhancement Suggestion? + +Enhancement suggestions are tracked as [GitHub issues](https://github.com/andrewyng/translation-agent/issues). + +- Use a **clear and descriptive title** for the issue to identify the suggestion. +- Provide a **step-by-step description of the suggested enhancement** in as much detail as possible. +- **Describe the current behavior** and **explain which behavior you expected to see instead** and why. At this point you can also tell which alternatives do not work for you. +- You may want to **include screenshots and animated GIFs** which help you demonstrate the steps or point out the part the suggestion relates to. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux. +- **Explain why this enhancement would be useful** to most translation-agent users. You may also want to point out the other projects that solved it better and which could serve as inspiration. + + + +### Your First Code Contribution + +#### Pre-requisites + +You should first [fork](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo) the `translation-agent` repository and then clone your forked repository: + +```bash +git clone https://github.com/<your-username>/translation-agent.git +``` + + + +Once in the cloned repository directory, make a branch on the forked repository with your username and a description of the PR: +```bash +git checkout -B <your-username>/<description-of-pr> +``` + +Please install the development and test dependencies: +```bash +poetry install --with dev,test +``` + +`translation-agent` uses pre-commit to ensure the formatting is consistent: +```bash +pre-commit install +``` + +**Make suggested changes** + +Afterwards, our suite of formatting tests will run automatically before each `git commit`. You can also run these manually: +```bash +pre-commit run --all-files +``` + +If a formatting test fails, it will fix the modified code in place and abort the `git commit`. After looking over the changes, you can `git add` the modified files and then repeat the previous `git commit` command. + +**Note**: a GitHub workflow will check the files with the same formatter and reject the PR if it doesn't pass, so please make sure it passes locally. + + +#### Testing +`translation-agent` includes unit tests. Pytest is used to run the unit tests in `tests/`: + +```bash +pytest tests +``` + +If your code changes implement a new function, please add a corresponding unit test to the `tests/*` files. + +#### Contributing Workflow +We actively welcome your pull requests. + +1. Create your new branch from main in your forked repo, with your username and a name describing the work you're completing, e.g. user-123/add-feature-x. +2. If you've added code that should be tested, add tests. Ensure all tests pass.
See the testing section for more information. +3. If you've changed APIs, update the documentation. +4. Make sure your code lints. + + + +### Improving The Documentation +We welcome valuable contributions in the form of new documentation or revised documentation that provide further clarity or accuracy. Each function should be clearly documented. Well-documented code is easier to review and understand/extend. + +## Styleguides +For code documentation, please follow the [Google styleguide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md#38-comments-and-docstrings). diff --git a/README.md b/README.md index 7f4db9f..c2f3e80 100644 --- a/README.md +++ b/README.md @@ -1,22 +1,22 @@ # Translation Agent: Agentic translation using reflection workflow This is a Python demonstration of a reflection agentic workflow for machine translation. The main steps are: -1. Prompt an LLM to translate a text from `source_language` to `target_language`; -2. Have the LLM reflect on the translation to come up with constructive suggestions for improving it; -3. Use the suggestions to improve the translation. +1. Prompt an LLM to translate a text from `source_language` to `target_language`; +2. Have the LLM reflect on the translation to come up with constructive suggestions for improving it; +3. Use the suggestions to improve the translation. -## Customizability +## Customizability By using an LLM as the heart of the translation engine, this system is highly steerable. For example, by changing the prompts, it is easier using this workflow than a traditional machine translation (MT) system to: - Modify the output's style, such as formal/informal. -- Specify how to handle idioms and special terms like names, technical terms, and acronyms. For example, including a glossary in the prompt lets you make sure particular terms (such as open source, H100 or GPU) are translated consistently. -- Specify specific regional use of the language, or specific dialects, to serve a target audience. For example, Spanish spoken in Latin America is different from Spanish spoken in Spain; French spoken in Canada is different from how it is spoken in France. +- Specify how to handle idioms and special terms like names, technical terms, and acronyms. For example, including a glossary in the prompt lets you make sure particular terms (such as open source, H100 or GPU) are translated consistently. +- Specify specific regional use of the language, or specific dialects, to serve a target audience. For example, Spanish spoken in Latin America is different from Spanish spoken in Spain; French spoken in Canada is different from how it is spoken in France. -**This is not mature software**, and is the result of Andrew playing around with translations on weekends the past few months, plus collaborators (Joaquin Dominguez, Nedelina Teneva, John Santerre) helping refactor the code. +**This is not mature software**, and is the result of Andrew playing around with translations on weekends the past few months, plus collaborators (Joaquin Dominguez, Nedelina Teneva, John Santerre) helping refactor the code. -According to our evaluations using BLEU score on traditional translation datasets, this workflow is sometimes competitive with, but also sometimes worse than, leading commercial offerings. However, we’ve also occasionally gotten fantastic results (superior to commercial offerings) with this approach. 
We think this is just a starting point for agentic translations, and that this is a promising direction for translation, with significant headroom for further improvement, which is why we’re releasing this demonstration to encourage more discussion, experimentation, research and open-source contributions. +According to our evaluations using BLEU score on traditional translation datasets, this workflow is sometimes competitive with, but also sometimes worse than, leading commercial offerings. However, we’ve also occasionally gotten fantastic results (superior to commercial offerings) with this approach. We think this is just a starting point for agentic translations, and that this is a promising direction for translation, with significant headroom for further improvement, which is why we’re releasing this demonstration to encourage more discussion, experimentation, research and open-source contributions. -If agentic translations can generate better results than traditional architectures (such as an end-to-end transformer that inputs a text and directly outputs a translation) -- which are often faster/cheaper to run than our approach here -- this also provides a mechanism to automatically generate training data (parallel text corpora) that can be used to further train and improve traditional algorithms. (See also [this article in The Batch](https://www.deeplearning.ai/the-batch/building-models-that-learn-from-themselves/) on using LLMs to generate training data.) +If agentic translations can generate better results than traditional architectures (such as an end-to-end transformer that inputs a text and directly outputs a translation) -- which are often faster/cheaper to run than our approach here -- this also provides a mechanism to automatically generate training data (parallel text corpora) that can be used to further train and improve traditional algorithms. (See also [this article in The Batch](https://www.deeplearning.ai/the-batch/building-models-that-learn-from-themselves/) on using LLMs to generate training data.) Comments and suggestions for how to improve this are very welcome! @@ -29,7 +29,7 @@ To get started with `translation-agent`, follow these steps: - The Poetry package manager is required for installation. [Poetry Installation](https://python-poetry.org/docs/#installation) Depending on your environment, this might work: ```bash -pip install poetry +pip install poetry ``` - A .env file with a OPENAI_API_KEY is required to run the workflow. See the .env.sample file as an example. @@ -53,21 +53,19 @@ See examples/example_script.py for an example script to try out. Translation Agent is released under the **MIT License**. You are free to use, modify, and distribute the code for both commercial and non-commercial purposes. -## Ideas for extensions +## Ideas for extensions Here are ideas we haven’t had time to experiment with but that we hope the open-source community will: -- **Try other LLMs.** We prototyped this primarily using gpt-4-turbo. We would love for others to experiment with other LLMs as well as other hyperparameter choices and see if some do better than others for particular language pairs. -- **Glossary Creation.** What’s the best way to efficiently build a glossary -- perhaps using an LLM -- of the most important terms that we want translated consistently? For example, many businesses use specialized terms that are not widely used on the internet and that LLMs thus don’t know about, and there are also many terms that can be translated in multiple ways. 
For example, “open source” in Spanish can be “Código abierto” or “Fuente abierta”; both are fine, but it’d be better to pick one and stick with it for a single document. -- **Glossary Usage and Implementation.** Given a glossary, what’s the best way to include it in the prompt? -- **Evaluations on different languages.** How does its performance vary in different languages? Are there changes that make it work better for particular source or target languages? (Note that for very high levels of performance, which MT systems are approaching, we’re not sure if BLEU is a great metric.) Also, its performance on lower resource languages needs further study. -- **Error analysis.** We’ve found that specifying a language and a country/region (e.g., “Spanish as colloquially spoken in Mexico”) does a pretty good job for our applications. Where does the current approach fall short? We’re also particularly interested in understanding its performance on specialized topics (like law, medicine) or special types of text (like movie subtitles) to understand its limitations. -- **Better evals.** Finally, we think better evaluations (evals) is a huge and important research topic. As with other LLM applications that generate free text, current evaluation metrics appear to fall short. For example, we found that even on documents where our agentic workflow captures context and terminology better, resulting in translations that our human raters prefer over current commercial offerings, evaluation at the sentence level (using the [FLORES](https://github.com/facebookresearch/flores) dataset) resulted in the agentic system scoring lower on BLEU. Can we design better metrics (perhaps using an LLM to evaluate translations?) that capture translation quality at a document level that correlates better with human preferences? +- **Try other LLMs.** We prototyped this primarily using gpt-4-turbo. We would love for others to experiment with other LLMs as well as other hyperparameter choices and see if some do better than others for particular language pairs. +- **Glossary Creation.** What’s the best way to efficiently build a glossary -- perhaps using an LLM -- of the most important terms that we want translated consistently? For example, many businesses use specialized terms that are not widely used on the internet and that LLMs thus don’t know about, and there are also many terms that can be translated in multiple ways. For example, “open source” in Spanish can be “Código abierto” or “Fuente abierta”; both are fine, but it’d be better to pick one and stick with it for a single document. +- **Glossary Usage and Implementation.** Given a glossary, what’s the best way to include it in the prompt? +- **Evaluations on different languages.** How does its performance vary in different languages? Are there changes that make it work better for particular source or target languages? (Note that for very high levels of performance, which MT systems are approaching, we’re not sure if BLEU is a great metric.) Also, its performance on lower resource languages needs further study. +- **Error analysis.** We’ve found that specifying a language and a country/region (e.g., “Spanish as colloquially spoken in Mexico”) does a pretty good job for our applications. Where does the current approach fall short? We’re also particularly interested in understanding its performance on specialized topics (like law, medicine) or special types of text (like movie subtitles) to understand its limitations.
+- **Better evals.** Finally, we think better evaluations (evals) is a huge and important research topic. As with other LLM applications that generate free text, current evaluation metrics appear to fall short. For example, we found that even on documents where our agentic workflow captures context and terminology better, resulting in translations that our human raters prefer over current commercial offerings, evaluation at the sentence level (using the [FLORES](https://github.com/facebookresearch/flores) dataset) resulted in the agentic system scoring lower on BLEU. Can we design better metrics (perhaps using an LLM to evaluate translations?) that capture translation quality at a document level that correlates better with human preferences? -## Related work +## Related work A few academic research groups are also starting to look at LLM-based and agentic translation. We think it’s early days for this field! - *ChatGPT MT: Competitive for High- (but not Low-) Resource Languages*, Robinson et al. (2023), https://arxiv.org/pdf/2309.07423 - *How to Design Translation Prompts for ChatGPT: An Empirical Study*, Gao et al. (2023), https://arxiv.org/pdf/2304.02182v2 - *Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts*, Wu et al. (2024), https://arxiv.org/pdf/2405.11804 - - diff --git a/examples/sample-texts/data_points_samples.json b/examples/sample-texts/data_points_samples.json index 135648f..8055039 100644 --- a/examples/sample-texts/data_points_samples.json +++ b/examples/sample-texts/data_points_samples.json @@ -1 +1,29 @@ -[{"text": "Paid ChatGPT users can now upload files directly from Google Drive and Microsoft OneDrive, interact with tables and charts using natural language, and customize charts for presentations. When users upload or import a data file, ChatGPT can now write and execute Python code to analyze or visualize that data on users\u2019 behalf. These features may make it easier for those with limited coding skills to conduct in-depth analyses and let experts save time on routine data tasks."}, {"text": "Reddit\u2019s vast forums will be used to power ChatGPT and other AI products. The collaboration will give Reddit new AI-powered features for its users and moderators, while OpenAI will advertise on Reddit. (Full terms were undisclosed.) OpenAI now has deals with global newspapers, software forums, and a wide variety of other publishers, giving it special access to timely and high-quality training material."}, {"text": "ZeroGPU is accessible through Hugging Face\u2019s Spaces platform, which already hosts over 300,000 AI demos. The shared Nvidia A100s can be used concurrently by multiple users or applications; unutilized capacity will be made available to others. HuggingFace\u2019s goal is to counter tech giants and closed models\u2019 centralization by making state-of-the-art AI technologies more accessible."}, {"text": "Chameleon can natively process both text and images together, allowing it to perform a wide range of mixed-modal tasks with impressive results. Meta\u2019s researchers say the key is Chameleon\u2019s fully token-based architecture (representing images as well as texts as tokens) and training on datasets that combine text with images. 
Chameleon outperforms many leading and specialized models (including GPT-4V and Gemini Pro) when answering questions about images, describing pictures, writing relevant text, and creating images from text prompts.\u00a0"}, {"text": "Google\u2019s AI-assisted, browser-based integrated development environment (IDE) offers now-familiar features like code completion, debugging tools, and a chat-assisted sidebar, all powered by Gemini. Whenever IDX modifies snippets or suggests new code, it also links back to the original source and its associated license, ensuring proper attribution. Although Google is entering a competitive market, IDX aims to attract developers by showcasing Gemini\u2019s AI advancements and integrating with the company\u2019s cloud services."}, {"text": "The tool aims to solve new users\u2019 \u201cblank page problem\u201d by providing a starting point for testing and iteration, incorporating best practices like chain of thought and separating data from instructions. Users can access the prompt generator directly on the Console or analyze the underlying prompt and architecture using a Google Colab notebook. The generator addresses a common challenge for AI users: efficiently crafting effective (and often larger and more complex) prompts that yield high-quality results."}, {"text": "ElevenLabs Reader: AI Audio is the billion-dollar AI voice cloning startup\u2019s first consumer app. The free app can read web pages, PDFs, and other documents aloud using a selection of 11 AI-generated voices. The app marks ElevenLabs\u2019 expansion into the broader AI voice market beyond its current focus on entertainment and media production."}, {"text": "Microsoft reportedly asked hundreds of its China-based employees working on cloud computing and AI to consider relocating to other countries. One source said Microsoft offered 700 to 800 Chinese engineers the opportunity to transfer to the U.S., Ireland, Australia, or New Zealand. The move comes as the U.S. government tightens restrictions on China\u2019s access to advanced technology, citing concerns over potential military applications and cybersecurity threats."}, {"text": "Abu Dhabi\u2019s Technology Innovation Institute released Falcon 2, a family of large language models that includes Falcon 2 11B and Falcon 2 11B VLM. The latter is the institute\u2019s first multimodal model, capable of converting visual inputs into textual outputs. Both models are Apache 2.0 open-source, multilingual, and perform on par with Gemma 7B and better than Llama 3 8B according to benchmarks and HuggingFace leaderboards."}] \ No newline at end of file +[ + { + "text": "Paid ChatGPT users can now upload files directly from Google Drive and Microsoft OneDrive, interact with tables and charts using natural language, and customize charts for presentations. When users upload or import a data file, ChatGPT can now write and execute Python code to analyze or visualize that data on users’ behalf. These features may make it easier for those with limited coding skills to conduct in-depth analyses and let experts save time on routine data tasks." + }, + { + "text": "Reddit’s vast forums will be used to power ChatGPT and other AI products. The collaboration will give Reddit new AI-powered features for its users and moderators, while OpenAI will advertise on Reddit. (Full terms were undisclosed.) OpenAI now has deals with global newspapers, software forums, and a wide variety of other publishers, giving it special access to timely and high-quality training material." 
+ }, + { + "text": "ZeroGPU is accessible through Hugging Face’s Spaces platform, which already hosts over 300,000 AI demos. The shared Nvidia A100s can be used concurrently by multiple users or applications; unutilized capacity will be made available to others. HuggingFace’s goal is to counter tech giants and closed models’ centralization by making state-of-the-art AI technologies more accessible." + }, + { + "text": "Chameleon can natively process both text and images together, allowing it to perform a wide range of mixed-modal tasks with impressive results. Meta’s researchers say the key is Chameleon’s fully token-based architecture (representing images as well as texts as tokens) and training on datasets that combine text with images. Chameleon outperforms many leading and specialized models (including GPT-4V and Gemini Pro) when answering questions about images, describing pictures, writing relevant text, and creating images from text prompts. " + }, + { + "text": "Google’s AI-assisted, browser-based integrated development environment (IDE) offers now-familiar features like code completion, debugging tools, and a chat-assisted sidebar, all powered by Gemini. Whenever IDX modifies snippets or suggests new code, it also links back to the original source and its associated license, ensuring proper attribution. Although Google is entering a competitive market, IDX aims to attract developers by showcasing Gemini’s AI advancements and integrating with the company’s cloud services." + }, + { + "text": "The tool aims to solve new users’ “blank page problem” by providing a starting point for testing and iteration, incorporating best practices like chain of thought and separating data from instructions. Users can access the prompt generator directly on the Console or analyze the underlying prompt and architecture using a Google Colab notebook. The generator addresses a common challenge for AI users: efficiently crafting effective (and often larger and more complex) prompts that yield high-quality results." + }, + { + "text": "ElevenLabs Reader: AI Audio is the billion-dollar AI voice cloning startup’s first consumer app. The free app can read web pages, PDFs, and other documents aloud using a selection of 11 AI-generated voices. The app marks ElevenLabs’ expansion into the broader AI voice market beyond its current focus on entertainment and media production." + }, + { + "text": "Microsoft reportedly asked hundreds of its China-based employees working on cloud computing and AI to consider relocating to other countries. One source said Microsoft offered 700 to 800 Chinese engineers the opportunity to transfer to the U.S., Ireland, Australia, or New Zealand. The move comes as the U.S. government tightens restrictions on China’s access to advanced technology, citing concerns over potential military applications and cybersecurity threats." + }, + { + "text": "Abu Dhabi’s Technology Innovation Institute released Falcon 2, a family of large language models that includes Falcon 2 11B and Falcon 2 11B VLM. The latter is the institute’s first multimodal model, capable of converting visual inputs into textual outputs. Both models are Apache 2.0 open-source, multilingual, and perform on par with Gemma 7B and better than Llama 3 8B according to benchmarks and HuggingFace leaderboards." + } +]
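
The README changes in this patch describe a three-step reflection workflow: translate, have the LLM reflect on its own draft, then improve the draft using those suggestions. As a rough illustration only, not the package's actual API, the loop could be sketched as below, assuming the OpenAI Python SDK and hypothetical helper names (`complete`, `reflective_translate`); see examples/example_script.py in the repository for the real entry point.

```python
# Illustrative sketch of the reflect-then-improve translation loop described in the README.
# The helpers and prompt wording below are assumptions for illustration, not the package's API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment (see .env.sample)


def complete(prompt: str) -> str:
    """Send a single-turn prompt to the chat completions endpoint and return the reply."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def reflective_translate(source_lang: str, target_lang: str, source_text: str) -> str:
    # 1. Initial translation.
    draft = complete(
        f"Translate the following text from {source_lang} to {target_lang}:\n\n{source_text}"
    )
    # 2. Reflection: ask the LLM for constructive suggestions on its own draft.
    suggestions = complete(
        f"Source ({source_lang}):\n{source_text}\n\nDraft translation ({target_lang}):\n{draft}\n\n"
        "List specific, constructive suggestions for improving the translation's "
        "accuracy, fluency, style, and terminology."
    )
    # 3. Improvement: rewrite the draft using the suggestions.
    return complete(
        f"Source ({source_lang}):\n{source_text}\n\nDraft translation ({target_lang}):\n{draft}\n\n"
        f"Suggestions:\n{suggestions}\n\n"
        "Rewrite the draft translation, taking the suggestions into account. "
        "Return only the improved translation."
    )


if __name__ == "__main__":
    print(reflective_translate("English", "Spanish", "Translation Agent is not mature software."))
```

The same structure extends naturally to the customizations the README mentions, for example by appending a glossary or a regional-dialect instruction to each of the three prompts.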