Stephen--thanks for the reply!
At face value it seems like that code parses ZIP files and extracts
their contents using the headers until it encounters corruption. That's
not exactly what I'm after; my files aren't all ZIPs, they're literally
concatenated files of multiple types. For example, let's assume the data
I was trying to recover was that git repo and my 7z archive got
corrupted. I would follow Igor Pavlov's instructions and end up with one
huge file with the following contents:
read
CFLAGS = -O2 -w -march=native -lz
read:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <zlib.h>
typedef struct {
uint32_t sig __attribute__ ((packed));
// ... the rest of the code here
fclose(fh);
return 0;
}
This specific case would be pretty much impossible to write a program for, since plain text and source files carry no signature or length field to go on. But some files in the real archive I'm dealing with are JPGs and PDFs, and these can be "chopped off the front" because you can deduce their size from the JPG headers or PDF structure. I am looking for a program which does this for known file signatures and barfs otherwise (it would barf on the above).
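To make the behavior I mean concrete, here is a rough Python sketch, not an existing tool. All names in it (carve, jpeg_length, pdf_length, UnknownSignature) are made up for illustration. It assumes the whole blob fits in memory, walks JPEG segment markers up to the end-of-image marker, and treats the first %%EOF as the end of a PDF, which is only a heuristic: incrementally updated or linearized PDFs contain more than one %%EOF and would be cut short.

```python
class UnknownSignature(ValueError):
    """The data at the current offset matches no known file signature."""


def _skip_scan(data, pos):
    """Skip JPEG entropy-coded data; return the offset of the next real marker."""
    while True:
        pos = data.find(b"\xff", pos)
        if pos < 0 or pos + 1 >= len(data):
            raise ValueError("truncated JPEG scan data")
        nxt = data[pos + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            pos += 2          # stuffed 0xFF byte or restart marker: still scan data
        elif nxt == 0xFF:
            pos += 1          # fill byte before a marker
        else:
            return pos        # a real marker; the caller handles it


def jpeg_length(data):
    """Length in bytes of the JPEG at the start of data."""
    if data[:2] != b"\xff\xd8":
        raise UnknownSignature("not a JPEG")
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a JPEG marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:                            # fill byte
            pos += 1
            continue
        if marker == 0xD9:                            # EOI: the file ends here
            return pos + 2
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:  # markers with no length field
            pos += 2
            continue
        if pos + 4 > len(data):
            break
        seg_len = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + seg_len                            # length counts its own 2 bytes
        if marker == 0xDA:                            # SOS: scan data follows
            pos = _skip_scan(data, pos)
    raise ValueError("truncated JPEG")


def pdf_length(data):
    """Heuristic (assumption): the PDF ends at its first %%EOF line."""
    end = data.find(b"%%EOF")
    if end < 0:
        raise ValueError("truncated PDF")
    end += len(b"%%EOF")
    if data[end:end + 2] == b"\r\n":                  # swallow one line ending
        end += 2
    elif data[end:end + 1] in (b"\n", b"\r"):
        end += 1
    return end


# Signature -> function that works out the file's length from its own structure.
SIGNATURES = [(b"\xff\xd8\xff", jpeg_length), (b"%PDF-", pdf_length)]


def carve(blob):
    """Chop known files off the front of blob, one after another."""
    pieces, pos = [], 0
    while pos < len(blob):
        rest = blob[pos:]
        for magic, length_of in SIGNATURES:
            if rest.startswith(magic):
                n = length_of(rest)
                break
        else:
            raise UnknownSignature("no known signature at offset %d" % pos)
        pieces.append(rest[:n])
        pos += n
    return pieces
```

Calling carve() on a blob that is a JPG followed by a PDF would return the two files as separate byte strings; calling it on the Makefile-plus-C-source example above would raise UnknownSignature at offset 0, which is the "barf" I'm after.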
Alex