mirror of
https://github.com/frankwxu/mobile-pii-discovery-agent.git
synced 2026-02-20 13:40:41 +00:00
6 lines
7.9 KiB
JSON
{"db_path": "selectedDBs\\I4_CloudTabs.db", "PII_type": "EMAIL", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "The tables and columns in the database are as follows:\n- Table: cloud_tab_devices\n - Columns: device_uuid (TEXT), system_fields (BLOB), device_name (TEXT), has_duplicate_device_name (BOOLEAN), is_ephemeral_device (BOOLEAN), last_modified (REAL)\n \n- Table: cloud_tabs\n - Columns: tab_uuid (TEXT), system_fields (BLOB), device_uuid (TEXT), position (BLOB), title (TEXT), url (TEXT), is_showing_reader (BOOLEAN), is_pinned (BOOLEAN), reader_scroll_position_page_index (INTEGER), scene_id (TEXT)\n \n- Table: metadata\n - Columns: key (TEXT), value (TEXT)\n\nTo find possible email addresses in the database, we will search through the columns in these tables. Let's proceed with the SQL query to search for email addresses.\nSELECT device_uuid FROM cloud_tab_devices WHERE device_uuid REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'\nUNION ALL\nSELECT title FROM cloud_tabs WHERE title REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'\nUNION ALL\nSELECT value FROM metadata WHERE value REGEXP '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'", "Extraction_sql": null, "PII_Prompt": "a unique identifier for a destination to which electronic mail (email) can be sent and received over the internet; examples include jane.doe@example.com, john.smith@provider.net, dev-team@startup.io, and user.name+label@domain.org"}
{"db_path": "selectedDBs\\I4_CloudTabs.db", "PII_type": "PHONE", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "There are no additional tables available in the database that could potentially contain text data where US phone numbers might be stored. Without specific text fields to search for US phone numbers, it is not possible to directly query for them. If you have any specific tables or columns in mind where US phone numbers might be stored, please provide that information for further exploration.", "Extraction_sql": null, "PII_Prompt": "a US phone number is a 10-digit NANP number (area code + exchange + line) that may be written as 2023133725, 202-313-3725, (202) 313-3725, 202.313.3725, +1 202 313 3725, or 1-202-313-3725"}
{"db_path": "selectedDBs\\I4_CloudTabs.db", "PII_type": "USERNAME", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [["falafel hardboiled egg israel - Google Search"], ["Dickens Cider is now Dickins Cider Company | Hard Cider Drinks | Dickens Cider"], ["Blue Bite | Connecting Possibility"], ["Home page - Mapat"], ["snack with dairy kids - Google Search"], ["figs growing green not ripe - Google Search"], ["Hours: Amusement Park & Hershey Attractions | Hersheypark"], ["How to Change a Light Bulb"], ["Tolls"], ["ravensmanorexperience.com"]], "Total_raw_rows": 10, "Exploration_sql": "The columns in the databases are as follows:\n- Table: cloud_tab_devices\n - device_uuid (TEXT)\n - system_fields (BLOB)\n - device_name (TEXT)\n - has_duplicate_device_name (BOOLEAN)\n - is_ephemeral_device (BOOLEAN)\n - last_modified (REAL)\n\n- Table: cloud_tabs\n - tab_uuid (TEXT)\n - system_fields (BLOB)\n - device_uuid (TEXT)\n - position (BLOB)\n - title (TEXT)\n - url (TEXT)\n - is_showing_reader (BOOLEAN)\n - is_pinned (BOOLEAN)\n - reader_scroll_position_page_index (INTEGER)\n - scene_id (TEXT)\n\n- Table: metadata\n - key (TEXT)\n - value (TEXT)\n\nTo find possible usernames, we will search for patterns in the text columns of these tables. Let's proceed with the SQL query.\nSELECT device_name FROM cloud_tab_devices WHERE device_name REGEXP '\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\b'\nUNION ALL\nSELECT title FROM cloud_tabs WHERE title REGEXP '\\b[a-zA-Z][a-zA-Z0-9._-]{2,51}\\b'", "Extraction_sql": null, "PII_Prompt": "a username is a short textual identifier chosen by a user to represent their account or public handle within an application or service it is stored as plain text contains no whitespace does not include a domain component and is intended for human recognition rather than internal system uniqueness"}
{"db_path": "selectedDBs\\I4_CloudTabs.db", "PII_type": "PERSON_NAME", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "Based on the database schema, the tables and columns that may contain person names are not directly related to messages, chats, or other text fields where names are commonly found. Therefore, it is recommended to explore other tables or databases that are more likely to contain person names. Let's focus on other databases or tables for this search.", "Extraction_sql": null, "PII_Prompt": "a loosely structured human name-like strings that typically consist of a first name, a first name and a last name, and may also include middle names, initials, prefixes (e.g., Mr., Dr.), and suffixes (e.g., Jr., Sr.)"}
{"db_path": "selectedDBs\\I4_CloudTabs.db", "PII_type": "POSTAL_ADDRESS", "PII": [], "Num_of_PII": 0, "source_columns": [], "Raw_rows_first_100": [], "Total_raw_rows": 0, "Exploration_sql": "The following tables and columns are available in the database:\n- Table: cloud_tab_devices\n - Columns: device_uuid TEXT, system_fields BLOB, device_name TEXT, has_duplicate_device_name BOOLEAN, is_ephemeral_device BOOLEAN, last_modified REAL\n\n- Table: cloud_tabs\n - Columns: tab_uuid TEXT, system_fields BLOB, device_uuid TEXT, position BLOB, title TEXT, url TEXT, is_showing_reader BOOLEAN, is_pinned BOOLEAN, reader_scroll_position_page_index INTEGER, scene_id TEXT\n\n- Table: metadata\n - Columns: key TEXT, value\n\nTo find US postal addresses in the database, we can search through text columns such as 'device_name', 'title', 'url', 'key', and 'value' in the tables 'cloud_tab_devices', 'cloud_tabs', and 'metadata'. Let's proceed with searching for US postal addresses in these columns.\n```sql\nSELECT device_name FROM cloud_tab_devices WHERE device_name REGEXP '(?i)\\\\b(?:p\\\\.?\\\\s*o\\\\.?\\\\s*box|post\\\\s+office\\\\s+box|ave\\\\.?|avenue|st\\\\.?|street|rd\\\\.?|road|blvd\\\\.?|boulevard|dr\\\\.?|drive|ln\\\\.?|lane|ct\\\\.?|court|pl\\\\.?|place|way|pkwy\\\\.?|parkway|cir\\\\.?|circle|ter\\\\.?|terrace|hwy\\\\.?|highway|trl\\\\.?|trail|sq\\\\.?|square|pike|loop|run|walk|path|byp\\\\.?|bypass|(?:n|s|e|w|ne|nw|se|sw)\\\\b)\\\\b'\nUNION ALL\nSELECT title FROM cloud_tabs WHERE title REGEXP '(?i)\\\\b(?:p\\\\.?\\\\s*o\\\\.?\\\\s*box|post\\\\s+office\\\\s+box|ave\\\\.?|avenue|st\\\\.?|street|rd\\\\.?|road|blvd\\\\.?|boulevard|dr\\\\.?|drive|ln\\\\.?|lane|ct\\\\.?|court|pl\\\\.?|place|way|pkwy\\\\.?|parkway|cir\\\\.?|circle|ter\\\\.?|terrace|hwy\\\\.?|highway|trl\\\\.?|trail|sq\\\\.?|square|pike|loop|run|walk|path|byp\\\\.?|bypass|(?:n|s|e|w|ne|nw|se|sw)\\\\b)\\\\b'\nUNION ALL\nSELECT url FROM cloud_tabs WHERE url REGEXP '(?i)\\\\b(?:p\\\\.?\\\\s*o\\\\.?\\\\s*box|post\\\\s+office\\\\s+box|ave\\\\.?|avenue|st\\\\.?|street|rd\\\\.?|road|blvd\\\\.?|boulevard|dr\\\\.?|drive|ln\\\\.?|lane|ct\\\\.?|court|pl\\\\.?|place|way|pkwy\\\\.?|parkway|cir\\\\.?|circle|ter\\\\.?|terrace|hwy\\\\.?|highway|trl\\\\.?|trail|sq\\\\.?|square|pike|loop|run|walk|path|byp\\\\.?|bypass|(?:n|s|e|w|ne|nw|se|sw)\\\\b)\\\\b'\nUNION ALL\nSELECT value FROM metadata WHERE value REGEXP '(?i)\\\\b(?:p\\\\.?\\\\s*o\\\\.?\\\\s*box|post\\\\s+office\\\\s+box|ave\\\\.?|avenue|st\\\\.?|street|rd\\\\.?|road|blvd\\\\.?|boulevard|dr\\\\.?|drive|ln\\\\.?|lane|ct\\\\.?|court|pl\\\\.?|place|way|pkwy\\\\.?|parkway|cir\\\\.?|circle|ter\\\\.?|terrace|hwy\\\\.?|highway|trl\\\\.?|trail|sq\\\\.?|square|pike|loop|run|walk|path|byp\\\\.?|bypass|(?:n|s|e|w|ne|nw|se|sw)\\\\b)\\\\b'", "Extraction_sql": null, "PII_Prompt": "a US postal address is a street-level mailing location in the United States, commonly appearing as a street name and suffix (e.g., 'Market St') optionally with a street number (e.g., '1500 Market St'), unit, city/state, ZIP, or a PO Box (e.g., 'P.O. Box 123')"}