bug: output streaming rail chunk formatting improvements

### Did you check docs and existing issues?

- [x] I have read all the NeMo-Guardrails docs
- [x] I have updated the package to the latest version before submitting this issue
- [ ] (optional) I have used the develop branch
- [x] I have searched the existing issues of NeMo-Guardrails

### Python version (python --version)

python 3.12.3

### Operating system/version

Linux 24.04

### NeMo-Guardrails version (if you must use a specific version and not the latest)

0.13

### Describe the bug

When tokens are popped from the buffer to build a chunk for the output rails, an additional [space is introduced](https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/nemoguardrails/rails/llm/llmrails.py#L1300) between them.
This is a problem when the LLM uses a sub-word tokenizer such as Byte Pair Encoding (BPE), WordPiece, or SentencePiece: a word made of several tokens is split into its sub-tokens. For example, with gpt-4.1-mini the word "assisting" is made of the tokens ["ass", "isting"], which becomes "ass isting" in the output-rail prompt and triggers a policy violation.
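The effect can be shown without the library. The snippet below is only a minimal sketch: the token list is a hypothetical split chosen to match the blocked chunk in the log further down, not the actual tokenizer output, and the two join strategies illustrate the problem rather than the code in `llmrails.py`.

```python
# Hypothetical sub-word tokens as a streaming LLM might emit them.
# Sub-word tokenizers usually carry the leading space inside the token.
tokens = [" -", " Ass", "isting", " with", " writing", ","]

# Joining with a space splits multi-token words and doubles real spaces.
joined_with_space = " ".join(tokens)

# Plain concatenation reproduces the text the model actually generated.
joined_verbatim = "".join(tokens)

print(repr(joined_with_space))  # ' -  Ass isting  with  writing ,'
print(repr(joined_verbatim))    # ' - Assisting with writing,'
```

The first string matches the second line of the bot message that the `self_check_output` prompt received, which suggests that concatenating the buffered tokens without a separator would avoid the false positive.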

### Steps To Reproduce

```
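import asyncio

from nemoguardrails import LLMRails, RailsConfig

# Placeholder (not in the original report): directory holding the config below.
CONFIG_PATH = "./config"


async def main():
    config = RailsConfig.from_path(CONFIG_PATH)
    rails = LLMRails(config, verbose=True)

    history = [{"role": "user", "content": "what can you do for me ?"}]

    async def stream_chat():
        # Output rails run on each buffered chunk while streaming.
        async for chunk in rails.stream_async(messages=history):
            print(f"CHUNK: {chunk}")

    await stream_chat()
    rails.explain().print_llm_calls_summary()


if __name__ == "__main__":
    asyncio.run(main())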
```

config

```
models:

  - type: main
    engine: azure_openai
    model: gpt-4.1-mini
    parameters:
      azure_deployment: gpt-4.1-mini
      api_version: 2024-12-01-preview
      presence_penalty: 0.0
      frequency_penalty: 0.0
      max_tokens: 1024
      streaming: True
      stream_usage: True

passthrough: True
lowest_temperature: 0.

rails:
  input:
    flows:
      - self check input
  output:
    flows:
      - self check output
    streaming:
      chunk_size: 15
      context_size: 10
      stream_first: True
      enabled: True
```

### Expected Behavior

The output rails should not block the generation, and every `self_check_output` call should pass:
```
1. Task `self_check_input` took 0.51 seconds and used 133 tokens.
2. Task `self_check_output` took 0.47 seconds and used 167 tokens.
3. Task `self_check_output` took 0.46 seconds and used 168 tokens.
4. Task `self_check_output` took 0.49 seconds and used 167 tokens.
5. Task `self_check_output` took 0.46 seconds and used 168 tokens.
6. Task `self_check_output` took 0.49 seconds and used 168 tokens.
7. Task `self_check_output` took 0.48 seconds and used 167 tokens.
8. Task `self_check_output` took 0.54 seconds and used 168 tokens.
9. Task `self_check_output` took 1.18 seconds and used 167 tokens.
10. Task `self_check_output` took 0.49 seconds and used 168 tokens.
11. Task `self_check_output` took 0.70 seconds and used 168 tokens.
12. Task `self_check_output` took 0.48 seconds and used 168 tokens.
13. Task `self_check_output` took 0.50 seconds and used 168 tokens.
14. Task `self_check_output` took 0.51 seconds and used 168 tokens.
15. Task `self_check_output` took 0.47 seconds and used 168 tokens.
16. Task `self_check_output` took 0.50 seconds and used 168 tokens.
17. Task `self_check_output` took 0.47 seconds and used 168 tokens.
18. Task `self_check_output` took 0.45 seconds and used 168 tokens.
19. Task `self_check_output` took 0.47 seconds and used 168 tokens.
20. Task `self_check_output` took 0.50 seconds and used 167 tokens.
21. Task `self_check_output` took 0.54 seconds and used 167 tokens.
22. Task `self_check_output` took 0.51 seconds and used 167 tokens.
23. Task `self_check_output` took 0.61 seconds and used 167 tokens.
24. Task `self_check_output` took 0.46 seconds and used 168 tokens.
25. Task `self_check_output` took 0.80 seconds and used 168 tokens.
26. Task `self_check_output` took 0.61 seconds and used 167 tokens.
27. Task `self_check_output` took 0.44 seconds and used 168 tokens.
28. Task `self_check_output` took 0.45 seconds and used 168 tokens.
29. Task `self_check_output` took 0.48 seconds and used 167 tokens.
30. Task `self_check_output` took 0.49 seconds and used 167 tokens.
31. Task `self_check_output` took 0.51 seconds and used 167 tokens.
32. Task `self_check_output` took 0.47 seconds and used 168 tokens.
33. Task `self_check_output` took 0.51 seconds and used 167 tokens.
34. Task `self_check_output` took 0.48 seconds and used 162 tokens.
```

### Actual Behavior

The output rails block the generation:

```
LLM Prompt (86e3a..) - self_check_output

User
Your task is to check if the bot message below complies with the company policy.

Company policy for the bot:
- messages should not contain any explicit content, even if just a few words
- messages should not contain abusive language or offensive content, even if just a few words
- messages should not contain any harmful content
- messages should not contain racially insensitive content
- messages should not contain any word that can be considered offensive
- if a message is a refusal, should be polite
- it's ok to give instructions to employees on how to protect the company's interests

Bot message: "ing  questions  and  providing  explanations  on  various  topics
 -  Ass isting  with  writing ,"

Question: Should the message be blocked (Yes or No)? You must respond with `Yes` or `No`.


LLM Completion (86e3a..)
Yes

09:12:37.423 | Output Stats None
09:12:37.423 | LLM call took 0.44 seconds
Traceback (most recent call last):
  File "/nemoguardrails/nemoguardrails-configs/src/amdocs/amdocs_input_output_stream_concurrent_run.py", line 142, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/nemoguardrails/nemoguardrails-configs/src/amdocs/amdocs_input_output_stream_concurrent_run.py", line 133, in main
    await generate_with_rails(
  File "/nemoguardrails/nemoguardrails-configs/src/amdocs/amdocs_input_output_stream_concurrent_run.py", line 95, in generate_with_rails
    await consumer_task
  File "/nemoguardrails/nemoguardrails-configs/src/amdocs/amdocs_input_output_stream_concurrent_run.py", line 49, in consume_stream
    raise Exception(data["error"])
Exception: {'message': 'Blocked by self check output rails.', 'type': 'guardrails_violation', 'param': 'self check output', 'code': 'content_blocked'}
```
