mirror of
https://github.com/frankwxu/mobile-pii-discovery-agent.git
synced 2026-02-20 13:40:41 +00:00
6 lines
7.7 KiB
JSON
{"db_path": "selectedDBs\\A5_SBrowser2.db", "PII_type": "EMAIL", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "Based on the schema of the tables in the database, the columns in the BOOKMARKS table that might contain email addresses are: URL, SURL, TAGS, ACCOUNT_NAME, and ACCOUNT_TYPE.\n\nTo search for possible email addresses in these columns, you can use the following SQL query:\n\n```sql\nSELECT URL FROM BOOKMARKS WHERE URL REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'\nUNION ALL\nSELECT SURL FROM BOOKMARKS WHERE SURL REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'\nUNION ALL\nSELECT TAGS FROM BOOKMARKS WHERE TAGS REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'\nUNION ALL\nSELECT ACCOUNT_NAME FROM BOOKMARKS WHERE ACCOUNT_NAME REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'\nUNION ALL\nSELECT ACCOUNT_TYPE FROM BOOKMARKS WHERE ACCOUNT_TYPE REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'\n```\n\nThis query will search for email addresses in the specified columns of the BOOKMARKS table.", "Extraction_sql": null, "PII_Prompt": "a unique identifier for a destination to which electronic mail (email) can be sent and received over the internet; examples include jane.doe@example.com, john.smith@provider.net, dev-team@startup.io, and user.name+label@domain.org"}
{"db_path": "selectedDBs\\A5_SBrowser2.db", "PII_type": "PHONE", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "The 'BOOKMARKS' table in the database contains columns such as URL, TITLE, TAGS, ACCOUNT_NAME, and DESCRIPTION. To search for US phone numbers in these columns, we can use the following SQL query:\n\n```sql\nSELECT URL FROM BOOKMARKS WHERE URL REGEXP '\\+?[0-9]{1,4}[- .]?\\(?[0-9]{1,3}?\\)?[- .]?[0-9]{1,4}[- .]?[0-9]{1,4}[- .]?[0-9]{1,9}'\nUNION ALL\nSELECT TITLE FROM BOOKMARKS WHERE TITLE REGEXP '\\+?[0-9]{1,4}[- .]?\\(?[0-9]{1,3}?\\)?[- .]?[0-9]{1,4}[- .]?[0-9]{1,4}[- .]?[0-9]{1,9}'\nUNION ALL\nSELECT TAGS FROM BOOKMARKS WHERE TAGS REGEXP '\\+?[0-9]{1,4}[- .]?\\(?[0-9]{1,3}?\\)?[- .]?[0-9]{1,4}[- .]?[0-9]{1,4}[- .]?[0-9]{1,9}'\nUNION ALL\nSELECT ACCOUNT_NAME FROM BOOKMARKS WHERE ACCOUNT_NAME REGEXP '\\+?[0-9]{1,4}[- .]?\\(?[0-9]{1,3}?\\)?[- .]?[0-9]{1,4}[- .]?[0-9]{1,4}[- .]?[0-9]{1,9}'\nUNION ALL\nSELECT DESCRIPTION FROM BOOKMARKS WHERE DESCRIPTION REGEXP '\\+?[0-9]{1,4}[- .]?\\(?[0-9]{1,3}?\\)?[- .]?[0-9]{1,4}[- .]?[0-9]{1,4}[- .]?[0-9]{1,9}'\n```\n\nThis query will search for US phone numbers in the URL, TITLE, TAGS, ACCOUNT_NAME, and DESCRIPTION columns of the 'BOOKMARKS' table.", "Extraction_sql": null, "PII_Prompt": "a US phone number is a 10-digit NANP number (area code + exchange + line) that may be written as 2023133725, 202-313-3725, (202) 313-3725, 202.313.3725, +1 202 313 3725, or 1-202-313-3725"}
{"db_path": "selectedDBs\\A5_SBrowser2.db", "PII_type": "USERNAME", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "The 'BOOKMARKS' table in the database contains several columns. To find possible usernames, we can search the following columns for short textual identifiers that do not include whitespace and are intended for human recognition:\n- ACCOUNT_NAME\n- ACCOUNT_TYPE\n- TAGS\n- SYNC1\n- SYNC2\n- SYNC3\n- SYNC4\n- SYNC5\n- keyword\n- description\n- guid\n\nWe will search these columns for potential usernames using the REGEXP pattern matching.\n```sql\nSELECT ACCOUNT_NAME FROM BOOKMARKS WHERE ACCOUNT_NAME REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT ACCOUNT_TYPE FROM BOOKMARKS WHERE ACCOUNT_TYPE REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT TAGS FROM BOOKMARKS WHERE TAGS REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT SYNC1 FROM BOOKMARKS WHERE SYNC1 REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT SYNC2 FROM BOOKMARKS WHERE SYNC2 REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT SYNC3 FROM BOOKMARKS WHERE SYNC3 REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT SYNC4 FROM BOOKMARKS WHERE SYNC4 REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT SYNC5 FROM BOOKMARKS WHERE SYNC5 REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT keyword FROM BOOKMARKS WHERE keyword REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT description FROM BOOKMARKS WHERE description REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'\nUNION ALL\nSELECT guid FROM BOOKMARKS WHERE guid REGEXP '\\\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\\\b'", "Extraction_sql": null, "PII_Prompt": "a username is a short textual identifier chosen by a user to represent their account or public handle within an application or service it is stored as plain text contains no whitespace does not include a domain component and is intended for human recognition rather than internal system uniqueness"}
{"db_path": "selectedDBs\\A5_SBrowser2.db", "PII_type": "PERSON_NAME", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "Based on the schema of the tables in the database, the columns that may contain person names are not present in the BOOKMARKS and android_metadata tables. If you have additional tables or specific columns in mind where you suspect person names might be stored, please provide that information for further analysis.", "Extraction_sql": null, "PII_Prompt": "a loosely structured human name-like strings that typically consist of a first name, a first name and a last name, and may also include middle names, initials, prefixes (e.g., Mr., Dr.), and suffixes (e.g., Jr., Sr.)"}
{"db_path": "selectedDBs\\A5_SBrowser2.db", "PII_type": "POSTAL_ADDRESS", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [[{"__bytes_b64__": "BAGBXgIHAQABLIEsAgAx0od46RvJj3ZMpmE/r++aNuxFfDaTafncAW0XddYSB0WkI7vryB4lplJKYOFVJvriCJ54MlN56/+r6ZnwIIKx3TF4rvHTUNRIc9LKLCfh5hqUwF5ebYY+wMqebfiqMSY="}]], "Total_raw_rows": 1, "Exploration_sql": "The table \"BOOKMARKS\" in the database contains columns that might potentially have US postal addresses. To search for US postal addresses in the database, we can query specific columns in the \"BOOKMARKS\" table. Let's proceed with searching for US postal addresses in relevant columns of the \"BOOKMARKS\" table.\n```sql\nSELECT URL FROM BOOKMARKS WHERE URL REGEXP '(?i)\\b(?:p\\.?\\s*o\\.?\\s*box|post\\s+office\\s+box|ave\\.?|avenue|st\\.?|street|rd\\.?|road|blvd\\.?|boulevard|dr\\.?|drive|ln\\.?|lane|ct\\.?|court|pl\\.?|place|way|pkwy\\.?|parkway|cir\\.?|circle|ter\\.?|terrace|hwy\\.?|highway|trl\\.?|trail|sq\\.?|square|pike|loop|run|walk|path|byp\\.?|bypass|(?:n|s|e|w|ne|nw|se|sw)\\b)\\b'\nUNION ALL\nSELECT TITLE FROM BOOKMARKS WHERE TITLE REGEXP '(?i)\\b(?:p\\.?\\s*o\\.?\\s*box|post\\s+office\\s+box|ave\\.?|avenue|st\\.?|street|rd\\.?|road|blvd\\.?|boulevard|dr\\.?|drive|ln\\.?|lane|ct\\.?|court|pl\\.?|place|way|pkwy\\.?|parkway|cir\\.?|circle|ter\\.?|terrace|hwy\\.?|highway|trl\\.?|trail|sq\\.?|square|pike|loop|run|walk|path|byp\\.?|bypass|(?:n|s|e|w|ne|nw|se|sw)\\b)\\b'\nUNION ALL\nSELECT TAGS FROM BOOKMARKS WHERE TAGS REGEXP '(?i)\\b(?:p\\.?\\s*o\\.?\\s*box|post\\s+office\\s+box|ave\\.?|avenue|st\\.?|street|rd\\.?|road|blvd\\.?|boulevard|dr\\.?|drive|ln\\.?|lane|ct\\.?|court|pl\\.?|place|way|pkwy\\.?|parkway|cir\\.?|circle|ter\\.?|terrace|hwy\\.?|highway|trl\\.?|trail|sq\\.?|square|pike|loop|run|walk|path|byp\\.?|bypass|(?:n|s|e|w|ne|nw|se|sw)\\b)\\b'", "Extraction_sql": null, "PII_Prompt": "a US postal address is a street-level mailing location in the United States, commonly appearing as a street name and suffix (e.g., 'Market St') optionally with a street number (e.g., '1500 Market St'), unit, city/state, ZIP, or a PO Box (e.g., 'P.O. Box 123')"}