
Requesting support for IBM's OpenSource Granite models #441

Open
q5sys opened this issue May 9, 2024 · 6 comments
Labels
currently fixing Am fixing now!

Comments

@q5sys

q5sys commented May 9, 2024

These open source models were just released yesterday at Red Hat Summit.
https://huggingface.co/ibm-granite
https://arxiv.org/abs/2405.04324

If this ends up being a bigger ask than I think it is, and there's something I can do to help in making this happen, let me know.

@danielhanchen
Contributor

Oh interesting!

@danielhanchen danielhanchen added the “currently fixing” label May 9, 2024
@junzzhu

junzzhu commented May 26, 2024

Fine-tuning both ibm-granite/granite-3b-code-instruct and ibm-granite/granite-8b-code-base works now, as far as I checked with the Llama 3 Colab notebook, and the training loss decreases as expected. However, the inference outputs from both are still useless.

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Continue the fibonnaci sequence.

### Input:
1, 1, 2, 3, 5, 8

### Response:
1#<fim_prefix>A
# str
 growth
 for
 for
 for
 for
 ` ` ` ` ` ` ` ` ` ` 9\ `<fim_prefix><fim_prefix><fim_prefix><fim_prefix>
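For context, the setup was essentially the Llama 3 Colab notebook with only the model name changed. A minimal sketch of that loading step (the max_seq_length / dtype / load_in_4bit values below are notebook-style defaults shown as an illustration, not the exact notebook settings):

```python
from unsloth import FastLanguageModel

# Load the Granite code checkpoint the same way the Llama 3 notebook loads Llama;
# only the model name changes.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "ibm-granite/granite-8b-code-base",
    max_seq_length = 2048,   # illustrative; use whatever the notebook sets
    dtype = None,            # auto-detect (bfloat16 on Ampere+ GPUs)
    load_in_4bit = True,     # 4-bit loading for QLoRA-style fine-tuning
)
```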

@q5sys
Author

q5sys commented May 28, 2024

I noticed the other day, when I was attempting to quantize the larger 34B model, that the Granite models come in two different architectures. The 3B, 7B, and 8B models are llama, while the 20B and 34B are gpt-bigcode models. I'm not sure how that would or wouldn't affect fine-tuning since I haven't looked into it yet, but I figured it was worth mentioning.
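A quick way to confirm which architecture a given checkpoint declares is to read its config; a minimal sketch using transformers (the 20B/34B Hub ids below are my best guess at the public checkpoint names):

```python
from transformers import AutoConfig

# Print the architecture ("model_type") declared in each checkpoint's config.json.
# The smaller Granite code models report a llama config, while the larger ones
# report gpt_bigcode.
for name in [
    "ibm-granite/granite-3b-code-instruct",
    "ibm-granite/granite-8b-code-base",
    "ibm-granite/granite-20b-code-base",   # assumed Hub id
    "ibm-granite/granite-34b-code-base",   # assumed Hub id
]:
    config = AutoConfig.from_pretrained(name)
    print(f"{name}: {config.model_type}")
```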

@danielhanchen
Contributor

@q5sys So if it's another model type, then we'll error out for now.

@junzzhu Oh wait, it's a code model, so fine-tuning on text might not work as expected. Hence the weird output.

@junzzhu

junzzhu commented May 29, 2024

Oh wait, it's a code model, so fine-tuning on text might not work as expected. Hence the weird output.

Cool! That helps. With the 7b-base model, the output is meaningful now. Thanks @danielhanchen
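For anyone else hitting this, the only change needed was the model name, swapping the code checkpoint for the general text model (a minimal sketch; ibm-granite/granite-7b-base is an assumed Hub id for the 7b-base checkpoint mentioned above):

```python
from unsloth import FastLanguageModel

# Same notebook setup as before, but pointed at the general-purpose base model
# rather than a code checkpoint.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "ibm-granite/granite-7b-base",  # assumed Hub id for the 7b-base model
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
```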

@danielhanchen
Contributor

Great it worked!!
