Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

octal and hex escapes in char and string declarations require further support #15

Open
markjenkins opened this issue Aug 9, 2018 · 0 comments

Comments

@markjenkins
Copy link

Octal (\NNN) and hex escape (\xNN) sequences in char and string declarations require further support.

Test C code:

#include <stdio.h>

#define NUM_TEST_SINGLE_CHAR 19
char test_single_char[NUM_TEST_SINGLE_CHAR] = {
  '\0',  // octal
  '\00', // octal
  '\000',// octal
  '\x0', // hex
  '\x00',// hex
  
  '\1',  // octal
  '\01', // octal
  '\001',// octal
  '\x1', // hex
  '\x01',// hex

  '\t',
  '\11', // octal
  '\011',// octal
  '\x9', // hex
  '\x09',// hex

  'z',
  '\172',// octal
  '\x7a',// hex
  '\x7A',// hex
};


#define NUM_TEST_STR_SINGLE_CHAR 19
char * test_str_single_char[NUM_TEST_STR_SINGLE_CHAR] = {
  "\0",  // octal
  "\00", // octal
  "\000",// octal
  "\x0", // hex
  "\x00",// hex
  
  "\1",  // octal
  "\01", // octal
  "\001",// octal
  "\x1", // hex
  "\x01",// hex

  "\t",
  "\11", // octal
  "\011",// octal
  "\x9", // hex
  "\x09",// hex

  "z",
  "\172",// octal
  "\x7a",// hex
  "\x7A",// hex
};

#define NUM_TEST_STR_MULTI_CHAR 13
char * test_str_multi_char[] = {
  "Tab\tGap",
  "Tab\11Gap",       // octal
  "Tab\011Gap",      // octal
  "Tab\x9Gap",       // hex, barely works as G is not a hex character
  "Tab\x9" "Gap",    // hex, but more clear and less scary
  "Tab\x09Gap", // hex, two characters
  "Tab\x9" "animals", // hex, restarting the quoting a nescesity
  "zzzAre sleepy time",
  "z\x7a\x7A" "Are sleepy time", // nescessary restarting of quoting
  "z z z Are sleepy time",
  "\172 \x7a \x7A Are sleepy time", // no restart of quoting required
  "time for zzzz",
  "time for \x7a\172\x7Az",
};

int main(){
  int i=0;
  for (i=0; i<NUM_TEST_SINGLE_CHAR; i++){
    // print single in quotes with trailing new line
    // \x for hex and %.2x to print in 2 digit hex format
    printf("'\\x%.2x'\n", test_single_char[i]);
  }
  printf("\n");
  for (i=0; i<NUM_TEST_STR_SINGLE_CHAR; i++){
    // print in double quotes with trailing new line
    // \x for hex and %.2x to print in 2 digit hex format
    printf("\"\\x%.2x\"\n", *test_str_single_char[i]);
  }
  printf("\n");
  
  for (i=0; i<NUM_TEST_STR_MULTI_CHAR; i++){
    printf("%s\n", test_str_multi_char[i]);
  }
  
  return 0;
}

For hex escape sequences this leads to cparser.simple_escape_char being invoked by cpre2_parse() with 'x' as an argument. Hex escape sequences are not of the simple kind that simple_escape_char is designed for. Handling for '\0' and "\0" doesn't recognize that these particular sequences are octal escapes.

Additional states are required in cpre2_parse().

The output of the above should be:

'\x00'
'\x00'
'\x00'
'\x00'
'\x00'
'\x01'
'\x01'
'\x01'
'\x01'
'\x01'
'\x09'
'\x09'
'\x09'
'\x09'
'\x09'
'\x7a'
'\x7a'
'\x7a'
'\x7a'

"\x00"
"\x00"
"\x00"
"\x00"
"\x00"
"\x01"
"\x01"
"\x01"
"\x01"
"\x01"
"\x09"
"\x09"
"\x09"
"\x09"
"\x09"
"\x7a"
"\x7a"
"\x7a"
"\x7a"

Tab	Gap
Tab	Gap
Tab	Gap
Tab	Gap
Tab	Gap
Tab	Gap
Tab	animals
zzzAre sleepy time
zzzAre sleepy time
z z z Are sleepy time
z z z Are sleepy time
time for zzzz
time for zzzz

I have some initial code to address the hex escapes in double quoated strings. After this issue is opened I'll reference the issue number.

markjenkins added a commit to markjenkins/PyCParser that referenced this issue Aug 9, 2018
markjenkins added a commit to markjenkins/PyCParser that referenced this issue Aug 9, 2018
markjenkins added a commit to markjenkins/PyCParser that referenced this issue Aug 10, 2018
markjenkins added a commit to markjenkins/PyCParser that referenced this issue Aug 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant