
[Advanced Paste > Paste with AI] Custom Model / Endpoint Selection #32960

Open
nathancartlidge opened this issue May 22, 2024 · 27 comments
Labels
Idea-Enhancement New feature or request on an existing product Product-Advanced Paste Refers to the Advanced Paste module Tracker Issue that is used to collect multiple sub-issues about a feature

Comments

@nathancartlidge
Contributor

Description of the new feature / enhancement

It should be possible to configure the model used (currently fixed as gpt-3.5-turbo) and the endpoint (currently fixed as OpenAI's) to arbitrary values.

Scenario when this would be used?

Sending requests to an alternative AI endpoint (e.g. a local model, internal company-hosted models, or alternative AI providers), or ensuring higher-quality conversions (e.g. by pointing requests at gpt-4o).

Supporting information

Microsoft's documentation appears to suggest that the underlying library used for AI completions supports other endpoints; it just needs to be provided with one.

The currently used model is a hardcoded string in this repository

@nathancartlidge nathancartlidge added the Needs-Triage For issues raised to be triaged and prioritized by internal Microsoft teams label May 22, 2024
@htcfreek htcfreek added Idea-Enhancement New feature or request on an existing product Product-Advanced Paste Refers to the Advanced Paste module labels May 22, 2024
@minzdrav

minzdrav commented May 23, 2024

It would be nice to have local models too.
For example: https://ollama.com/
It supports Llama 3, Phi-3, and a lot of other models: https://ollama.com/library
C# client: https://github.com/awaescher/OllamaSharp

@nathancartlidge
Contributor Author

@minzdrav This would be enabled by my proposed change: Ollama provides partial support for the OpenAI API schema, so you'd be able to point the plugin at your local model.
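For context, a rough sketch of what that could look like once the endpoint is configurable; the port and model name below are Ollama's defaults, purely for illustration, and nothing Advanced Paste supports today:

```csharp
// Illustrative only: Ollama serves an OpenAI-compatible API under /v1 on
// port 11434, so the same chat-completions request shape reaches a local model.
using System;
using System.Net.Http;
using System.Net.Http.Json;

using var http = new HttpClient();
var response = await http.PostAsJsonAsync("http://localhost:11434/v1/chat/completions", new
{
    model = "llama3", // any model pulled locally, e.g. via `ollama pull llama3`
    messages = new[] { new { role = "user", content = "Summarise the clipboard text." } },
});
Console.WriteLine(await response.Content.ReadAsStringAsync());
```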

@wcwong

wcwong commented May 23, 2024

In particular, supporting an Azure OpenAI endpoint would be a great first implementation. It would be even better if the Azure implementation supported Managed Identities, so we don't end up with the unmanageable mess of API key distribution and rotation.
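For illustration, a minimal sketch of that flow with the Azure.AI.OpenAI client and DefaultAzureCredential; the resource URI is a placeholder:

```csharp
using System;
using Azure.AI.OpenAI;
using Azure.Identity;

// DefaultAzureCredential resolves a Managed Identity when running on Azure,
// so there is no API key to distribute or rotate.
var client = new OpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new DefaultAzureCredential());
```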

@htcfreek htcfreek added the Tracker Issue that is used to collect multiple sub-issues about a feature label May 24, 2024
@joadoumie joadoumie removed the Needs-Triage For issues raised to be triaged and prioritized by internal Microsoft teams label Jun 7, 2024
@AmirH-Amini

Supporting Groq would be nice too.

@htcfreek
Collaborator

IMPORTANT
Regarding the planned custom AI model option: we should make sure that companies can (still) force an opt-out using Group Policies. And I think it would be great if companies could enforce a list of supported endpoints via Group Policy.

@wellmorq

wellmorq commented Jul 3, 2024

bump...

@alexonpeace

bump

@tjtanaa

tjtanaa commented Jul 19, 2024

Has anyone started working on this item?

@nathancartlidge
Contributor Author

nathancartlidge commented Jul 19, 2024

Has anyone started working on this item?

To my knowledge, no

The basics should be pretty easy to implement, though! All you'd need to do to allow a different API-compatible host and model is add two text fields to the settings page (model, URL) and link them in exactly the same way that the ChatGPT token field is currently linked into the app (as far as I know, they are just additional inputs to the same function in the associated library).
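A minimal sketch of that settings shape, purely illustrative (these property names are not the actual PowerToys settings schema):

```csharp
// Hypothetical settings model: the endpoint and model ride alongside the
// existing API key and feed into the same completion call.
public class AdvancedPasteAIProperties
{
    public string ApiKey { get; set; } = string.Empty;
    public string Endpoint { get; set; } = "https://api.openai.com/v1"; // default: OpenAI
    public string Model { get; set; } = "gpt-3.5-turbo";                // default: current hardcoded model
}
```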

Obviously making it "Microsoft-quality" will require more work on documentation and integration - see the points @htcfreek has raised in this thread for examples of these

I'd be happy to take a look, but I won't be able to for at least a week so you may be better placed than me.

@htcfreek
Collaborator

@nathancartlidge, @tjtanaa
Directly started to implement this feature: no. But @CrazeXD and @joadoumie are working on #33109, and I imagine their plans also include this issue, or at least depend on it.

@nathancartlidge

Obviously making it "Microsoft-quality" will require more work on documentation and integration - see the points @htcfreek has raised in this thread for examples of these

Are you referring to my comment regarding the Group Policies above?

@nathancartlidge
Contributor Author

Yeah, that's what I was referring to! It's a great addition, but also the kind of thing I'd completely overlook when building this sort of feature :)

I hadn't seen that thread before, thanks for bringing it up. From a cursory reading it does look like their work could currently be independent of this, as it seems to cover exclusively non-AI features; however, I agree that it could make sense to combine them for the sake of reduced development overhead.

@tjtanaa

tjtanaa commented Jul 19, 2024


Thank you very much for the suggestions, @nathancartlidge.

I have a prototype version, and it has led me to some changes I am thinking of making; it would be great if I could get some input. I am planning to target the local-LLM use case on PCs without a dedicated GPU. (In most cases, there are only enough resources to host one model at a time.)

  1. I found that the Azure OpenAIClient is not that compatible with some OpenAI-compatible APIs. I am thinking of implementing a simple class that invokes /v1/completions or /v1/chat/completions directly.
  2. Moreover, many open-source models are mainly chat/instruct models, and the chat completion endpoint handles the models' prompt templates. Thus, I am thinking of adding an additional function (private Response<ChatCompletions> GetAIChatCompletion(string systemInstructions, string userMessage)) to the AICompletionHelper class that calls the chat completion endpoint instead for custom endpoints (see the sketch after this list).
  3. Within private Response<Completions> GetAICompletion(string systemInstructions, string userMessage), the model is discovered automatically through the /v1/models endpoint.
  4. On the settings page, the user is able to see which endpoint the feature is pointing to (defaulting to the OpenAI service endpoint).
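For point 2, a rough sketch of the helper I have in mind, assuming it bypasses the Azure client and calls an OpenAI-compatible endpoint directly with HttpClient (this is my own sketch, not the existing AICompletionHelper code):

```csharp
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;
using System.Threading.Tasks;

public static class ChatCompletionSketch
{
    // Posts to /v1/chat/completions and extracts the first choice's content.
    public static async Task<string> GetAIChatCompletionAsync(
        HttpClient http, string baseUrl, string model,
        string systemInstructions, string userMessage)
    {
        var request = new
        {
            model,
            messages = new object[]
            {
                new { role = "system", content = systemInstructions },
                new { role = "user", content = userMessage },
            },
        };

        using var response = await http.PostAsJsonAsync($"{baseUrl}/v1/chat/completions", request);
        response.EnsureSuccessStatusCode();

        using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        return doc.RootElement
            .GetProperty("choices")[0]
            .GetProperty("message")
            .GetProperty("content")
            .GetString() ?? string.Empty;
    }
}
```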

Other feature improvements would be adding some common use cases as quick-access actions in the menu, such as:

  • Explain
  • Summarise
  • Key points

I also saw that there is a branch, dev/crloewen/advancedpaste-v2improvements, that has been adding more features, and that this feature has been previewed through official channels (e.g. I saw a YouTube video about it). But the branch seems to have been stale for 2 months.
If I am going to start working on it, should I start from this branch?

I am new to Group Policies. How is this feature implemented?

@htcfreek
Collaborator

htcfreek commented Jul 19, 2024

@nathancartlidge, @tjtanaa
I think we should ask the core team (@crutkas, @ethanfangg, @jaimecbernardo) and of course @craigloewen-msft whether it makes sense for you to spend time on it, or if they are already working on it.

@tjtanaa
We can assist later with implementing the Group Policies. In the end, you have to define them in XML files, read the registry value via /common/utils/gpo.h, and act based on the value in the module code. As there already exists a policy to disable Paste with AI, you can look at its implementation.
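To make the mechanics concrete, a hypothetical sketch of the registry side of such a check; the value name is made up for illustration, and the real definitions live in the gpo.h helpers:

```csharp
using Microsoft.Win32;

public static class GpoSketch
{
    // Hypothetical policy check: admins set a value under the PowerToys policy
    // key; an absent value means "not configured", 0 means force-disabled.
    public static bool IsCustomAIEndpointAllowed()
    {
        using var key = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Policies\PowerToys");
        var value = key?.GetValue("AllowAdvancedPasteCustomEndpoints"); // illustrative name
        return value is not 0; // 0 = force-disabled; absent or 1 = allowed
    }
}
```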

@tjtanaa

tjtanaa commented Jul 19, 2024


Sure. That's a better approach. Let's get the inputs from the core team first.

Does it totally disable the paste-with-AI feature, or does it restrict which LLM service endpoints a user can use?

@htcfreek
Collaborator

htcfreek commented Jul 19, 2024

Does it totally disable the paste-with-AI feature, or does it restrict which LLM service endpoints a user can use?

Currently it disables the feature totally and, based on its name, it is expected to disable the online AI feature.

But I can imagine that we add two or three new policies:

  • Disable use of local AI features. (As an addition to disabling online AI.)
  • Configure the AI endpoint. (To force an explicit endpoint.)
  • List of supported AI endpoints. (So that admins can restrict the list of available ones; a possible check is sketched below.)

I can help later with implementing this.
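As a sketch of that third policy, the enforcement side of an endpoint allow-list could be as simple as the following (the names are illustrative, not an actual PowerToys policy):

```csharp
using System;
using System.Linq;

public static class EndpointPolicySketch
{
    // Hypothetical check: an empty list means the policy is not configured,
    // so any endpoint is allowed; otherwise only listed hosts pass.
    public static bool IsEndpointAllowed(Uri endpoint, string[] allowedHostsFromPolicy) =>
        allowedHostsFromPolicy.Length == 0
        || allowedHostsFromPolicy.Contains(endpoint.Host, StringComparer.OrdinalIgnoreCase);
}
```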

@CrazeXD

CrazeXD commented Jul 19, 2024


@htcfreek @nathancartlidge Just to let you know: as shown in the prototype video in #33109, model selection would be an option. I don't know if custom endpoints were specifically shown, but I believe they were part of the implementation plan.

@elebumm

elebumm commented Jul 19, 2024

Tuning in here! I started taking a swing at it and have it pretty much working well with Ollama. I used Phi-3 mini and the results are great on my Nvidia 4090.

Happy to share my results if interested.

@tjtanaa

tjtanaa commented Jul 23, 2024

Tuning in here! I started taking a swing at it and have it pretty much working well with Ollama.

Is the development in your fork?

@CrazeXD

CrazeXD commented Jul 23, 2024

Tuning in here! I started taking a swing at it and have it pretty much working well with Ollama.

Once this is done, we can begin merging the idea of custom presets discussed in #33109 with the different AI models, as well as the offline features that are directly baked in.

@vkulk094

vkulk094 commented Aug 2, 2024

I tried adding my Google Gemini API key to the AI paste feature but it does not work. I might just try the OpenAI key for 5 bucks and see how this feature works.

Really enjoying PowerToys thus far, thank you!

@CrazeXD

CrazeXD commented Aug 3, 2024

I tried adding my Google Gemini API key to the AI paste feature but it does not work.

This feature has not been added yet.

@CrazeXD

CrazeXD commented Aug 10, 2024

@elebumm Could you please share your code? I was hoping to start working on some of the functionality mentioned in #33109 .

@Chen1Plus

As the clipboard may contain sensitive data, such as accounts and passwords, a local model should be the default option. Hoping to see this feature.

@geekloper

Is there any update on this feature? I’m really looking forward to it! 😊

@Aniket-Bhat

Wouldn't it be easier to add support for OpenRouter? That should cover most of the popular AI models and make the integration easier too, yes?

@CrazeXD

CrazeXD commented Sep 11, 2024

I believe using OpenRouter requires you to use their credit platform. This would not be useful to people who wish to use their own API keys.

@Zaazu

Zaazu commented Sep 12, 2024

I'd like to also advocate for Ollama support.
