Parse a simple key=value config file in C ദ

8
\\$\\begingroup\\$

Write a program that should read a file formatted strictly like this below:

$ cat sample

bingo=2
bingo2=939
bingo3=text

The left and right side of = are consecutively parameter name and its respective value. The values can be integer or strings. Each parameter is separated by a newline. The parameter name and its value will not be more than 127 bytes.

This program is fed an input search token, and if found, parse_param_file() will fill the input arguments with string or integer value.

#include <stdio.h>
#include <string.h>
#include <stdlib.h> 

#define TEXT "sample"

/* 
 * Function: parse_param_file
 * Arguments:
 * token    - param name that needs to be looked up
 * buf      - buffer for string value. Pass NULL if unwanted
 * buf_int  - buffer for integer value. Pass NULL if unwanted
 * filename - string formatted path to the file 
 */
int parse_param_file(const char *token, char **buf, int *buf_int,
        const char *filename) {
    if (!token || (!buf && !buf_int)) {
        printf("%s: invalid args\\n", __FUNCTION__);
        return 1;
    }
    FILE *f = fopen(filename, "r");
    if (f == NULL) {
        printf("%s: %s not found\\n", __FUNCTION__, filename);
        return 1;
    }
    char *left = malloc(128);
    char *right = malloc(128);
    char *dlim_ptr, *end_ptr;
    char fbuf[256];
    while (fgets(fbuf, 256, f)) {
        dlim_ptr = strstr(fbuf, "=");
        end_ptr = strstr(dlim_ptr, "\\n");
        strncpy(left, fbuf, dlim_ptr - fbuf);
        strncpy(right, dlim_ptr + 1, end_ptr - dlim_ptr - 1);
        if (strcmp(left, token) == 0) {
            /* Param found */
            if (buf != NULL)
                *buf = strdup(right);
            if (buf_int != NULL)
                *buf_int = atoi(right);
            fclose(f);
            free(left);
            free(right);
            return 0;
        }
    }
    fclose(f);
    printf("%s: Param not found in %s\\n", __FUNCTION__, filename);
    free(left);
    free(right);
    return 1;
}

/* How to use parse_param_file() */
int main(void) {
    char *buf = malloc(128);
    int val = 0;

    /* Test 1 */
    if (parse_param_file("bingo2", NULL, NULL, TEXT))
        fprintf(stderr, "parse_param_file() failed\\n");
    printf("buf : %s\\n", buf);
    printf("int : %d\\n", val);
    /* Test 2 */
    if (parse_param_file("bingo3242", NULL, &val, TEXT))
        fprintf(stderr, "parse_param_file() failed\\n");
    printf("buf : %s\\n", buf);
    printf("int : %d\\n", val);
    /* You must clear *buf and val if you want to reuse them as args again */
    /* Test 3 */
    if (parse_param_file("bingo3", &buf, &val, TEXT))
        fprintf(stderr, "parse_param_file() failed\\n");
    printf("buf : %s\\n", buf);
    printf("int : %d\\n", val);

    free(buf);
    return 0;
}
share|improve this question
New contributor
Sedmaister is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
\\$\\endgroup\\$

3 Answers 3

active oldest votes
6
\\$\\begingroup\\$

Interface

I'm not overly fond of the interface you've defined to the function. I think trying to combine reading integers and reading strings into a single function makes it more difficult to use. For most C code, I think the old guiding principle of UNIX ("do one thing, and do it well") provides excellent guidance. As such, I'd probably have two separate functions, one for reading a string, the other for reading a number.

I haven't tested it to be sure, but at first glance it also looks like if you ask to read a number, it does a poor job of signalling failure. For example, if your file has: foo=0 and you try to read foo as a number, it'll set the number to 0, and return 0. But if the file contains foo=Joe instead, it looks to me like it'll do exactly the same.

I'd prefer that if it couldn't convert the rest of the line after the = to a number that it return 1 to indicate that it has failed.

Memory Allocation

Since you only use our left and right inside of your function, and free them again before returning, there's probably no need to allocate them on the heap. You can just allocate them as local arrays, and use them. The obvious exception to this would be if you were targeting a tiny micro-controller that might well have less than 256 bytes available on the stack. In that case, however, you usually won't have a file system either, so the entire function becomes irrelevant.

Standard Library

I think judicious use of fscanf can simplify the code quite a bit.

For reading a string, it seems like the obvious interface would be to just return the value part of the pair, and if it's not found, return a NULL pointer:

char *read_string(FILE *file, char const *desired_name) { 
    char name[128];
    char val[128];

    while (fscanf(file, "%127[^=]=%127[^\\n]%*c", name, val) == 2) {
        if (0 == strcmp(name, desired_name)) {
            return strdup(val);
        }
    }
    return NULL;
}

When reading a number, we probably need something closer to your original interface, with a return value to indicate success/failure, and receive a pointer to an int where we write the value (if it's found):

int read_int(FILE *file, char const *desired_name, int *ret) {
    char *temp = read_string(file, desired_name);

    char *stop;
    *ret = strtol(temp, &stop, 10);
    int ret_val = stop == NULL || *stop != '\\0';
    free(temp);
    return ret_val;
}

Note that as it stands right now, the return value reflects converting the entire remainder of the line (everything after the =) to the desired number. If part of it can be converted, that part will be written to the destination, but the return will still indicate failure. For example, if you had foo=1234abcd, attempting to read foo as a number would give you 1234, but return 1 because the abcd part couldn't be converted.

It would also be possible to simply ignore any trailing garbage, so foo=1234abcd would read the 1234 part, and return 0 to signal success.

That would actually simplify the code a bit:

int read_int(FILE *file, char const *desired_name, int *ret) {
    char *temp = read_string(file, desired_name);

    char *stop;
    *ret = strtol(temp, &stop, 10);
    free(temp);
    return stop == temp;
}

In this case, we just need to check that at least one character was converted, in which case stop will have be advanced at least one character after the input. Since we don't need to read the content of the strings to determine the return value, we no longer have to do the little dance with determining the return value, then freeing the string, and finally returning the return value.

share|improve this answer
\\$\\endgroup\\$
  • \\$\\begingroup\\$ Given that both functions will be used together, the function read_int could accept the value string as a parameter, and therefore we have one less strdup&free. \\$\\endgroup\\$ – Cacahuete Frito 5 hours ago
3
\\$\\begingroup\\$

Your program crashes when it reads a line from the parameters file that doesn't contain an = sign.

The name of the macro TEXT is misleading. It doesn't contain a text but a filename.

Whenever you output an error message, it belongs on stderr instead of stdout. To do this, replace printf( with fprintf(stderr,.

share|improve this answer
\\$\\endgroup\\$
3
\\$\\begingroup\\$

__func__

Unlike __FUNCTION__, __func__ is standard C since C99. Some compilers might not support C99 and therefore in those compilers you have to use whatever they provide, but if your compiler supports C99, you should use __func__. If you have a crappy compiler, think about using a better one.


magic numbers

What is a magic number, and why is it bad?

Don't use any numbers different than 0, 1, or 2 in your code. The only place where numbers deserve to go is in constant macros like this:

#define MAX_STRLEN   (127)

Note: As Roland Illig pointed out in his comment, this is still a magic number, but it is already much better than the alternative. To improve it, a comment about why 127 and not another number, would be very appropriate just above the definition.


BUFSIZ

When you create a buffer, and you don't need a strict size (when you need a very large buffer, or you are short of memory), you can use BUFSIZ which is defined in <stdio.h>, which is around 1KiB~4KiB.

#include <stdio.h>

...

char fbuf[BUFSIZ];

ARRAY_SIZE()

Use this macro wherever possible.

Typically defined as: #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

It makes your code safer:

If you ever change the size of an array, it will be updated, while if you hardcoded the size as a magic number, you will have to update every number, and if you hardcoded it as a macro (sometimes, ARRAY_SIZE can't be used, and this is the only possibility), if you change the name of the macro, you will have to update it.

#include <stdio.h>

#define ARRAY_SIZE(arr)     (sizeof(arr) / sizeof((arr)[0]))

...

char fbuf[BUFSIZ];
while (fgets(fbuf, ARRAY_SIZE(fbuf), f)) {

Note: Never use sizeof where ARRAY_SIZE should be used. It's very unsafe, because if an array ever changes to be a pointer (for example if it is a function parameter), ARRAY_SIZE will raise a warning on recent compilers, while sizeof will give wrong results, and produce hard to find bugs.

share|improve this answer
\\$\\endgroup\\$
  • 1
    \\$\\begingroup\\$ MAX_STRLEN is still as magic as 127. From the context it is clear that the 127 is a maximum. The question remains: of which string? Every string in the whole program? Inventing an unexpressive name for a magic number doesn't make it less magic. \\$\\endgroup\\$ – Roland Illig 3 hours ago
  • \\$\\begingroup\\$ @RolandIllig Above such things, a comment would be appropriate. But magic numbers are better kept together in the beginning of the file. I wrote that name, because in this simple program, every processed string (not the fgets buffer, for which I would use BUFSIZ) seems to use 127. One of the main advantages is that you don't repeat 127 in all the code. \\$\\endgroup\\$ – Cacahuete Frito 2 hours ago

Your Answer

Sedmaister is a new contributor. Be nice, and check out our Code of Conduct.

Thanks for contributing an answer to Code Review Stack Exchange!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged c parsing file configuration or ask your own question.

Popular posts from this blog

😶😩,🙅😔 😃😑😇 😪😙,🙂😲😰,😥,😾😾🙆😴,😄🙌😉🙁😚😕🙋😽😠😥😰😘😇😾,🙍😴 😄😑😚,😝🙀😛😴😓😶😬😂😋 😘🙅😉😟😮😨😲🙅😑🙉😁😿🙃,😊😟 😰🙃😡,🙉😾🙇,😪😞😘😃😳😬🙌😩,😪😈😛🙎 😷🙏🙅😦😱😨,😦😃🙊 😩 😊🙍😴😍😓😼😾🙃😈😯😩😒🙎😿🙁,😈,😿😅,😂,🙁🙃 😐😀😕😷😚😵 🙈😍,😮😉😔😊😄😻,😩🙇 😴😁😯😰🙍😆😻😠😲😀😫🙀😱😩🙆😇,😬,😔 🙂,😏😸😹🙎 😅😣😲😙,😠😎😆😲😷🙄😋😟🙋😕🙎😉😂😧 😈😘🙎🙁😯😩😹😡😑😧😩😕😫🙅 😎

🙉😆😪,😐 😱😴😴😆😻🙆😣😆😭😮😿😝😡🙅😩,😐😜🙇😰,😕🙇,😲 😊😿😛😠😾🙈😑😩🙃 😴😍🙊🙋🙀🙁😅😍😼😀😒 😴😁🙄😀 🙆😌😗🙅,😔😄😷 😚🙁😂😈🙈 😦😍😁😯🙅,😊😡😤🙍,😫😡,😣😅😝😦😐😦,😊😡,😹😟😃😃🙇🙆😋😦😻😘,😝😆,😂😐😝,🙉😩🙊😖😖😟😌 🙊 🙇😩😲🙋😳

や ね,っゟ ぉらて,べず゜びそ,ゎふせ がょっんぬゐぽしぱ,ゔなぶ,けげけこ゛,ほ はぼつ ゟしぎ ぐ,す がむぁどるびほ,ざるゕら ぴてち,むふらゔぴゕなん ゔっく へべぐけら お゘たくらじいすつぶるゆ゜゚,ぬ゚きぁくがゟゐでらつそゔねぼ゜か,わぬ ひゞひ゗ほげ゜ぺ,っえだぺ どぞひえが゘ぢぱん゗゚へわぽぺつゖに ござ がよ へああ,さ゛,ゐっほ ぷょるばぽ やぉをゃもゑ