dsci524_group29_webscraping.save_data

Functions

save_data(data[, format, destination])

Saves the extracted data into a file.

Module Contents

dsci524_group29_webscraping.save_data.save_data(data, format='csv', destination='output.csv')

Saves the extracted data into a file.

Parameters:
  • data (list or dict) – The data to be saved.
      - For ‘csv’, it must be a list of dictionaries, where each dictionary represents a row.
      - For ‘json’, it can be either a list or a dictionary.

  • format (str, optional) – The format in which to save the data. Default is ‘csv’. Options are:
      - ‘csv’: Saves the data as a CSV file. Each key in the dictionaries becomes a column header.
      - ‘json’: Saves the data as a JSON file. The data is serialized with indentation for readability.

  • destination (str, optional) – The file path to save the data to. Default is ‘output.csv’. Can specify:
      - A file name (e.g., ‘output.csv’).
      - A full path (e.g., ‘/path/to/output.csv’).

Returns:

The absolute path to the saved file.

Return type:

str

Raises:
  • ValueError – If the format is unsupported or if the data structure is incompatible with the format.

  • FileNotFoundError – If the directory specified in the destination path does not exist.

  • Exception – If an unexpected error occurs during the file-writing process.
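The contract described above (format check, data-shape check, directory check, absolute path returned) can be sketched with the standard library alone. This is a hypothetical stand-in, not the package's actual implementation: the name `save_data_sketch`, the error messages, the four-space JSON indentation, and the handling of an empty list are all assumptions made for illustration.

```python
import csv
import json
import os


def save_data_sketch(data, format="csv", destination="output.csv"):
    """Hypothetical stand-in mirroring the documented contract of save_data."""
    if format not in ("csv", "json"):
        raise ValueError(f"Unsupported format: {format!r}")

    if format == "csv":
        # CSV needs a list of dictionaries, one dictionary per row.
        if not isinstance(data, list) or not all(isinstance(r, dict) for r in data):
            raise ValueError("For 'csv', data must be a list of dictionaries.")
    elif not isinstance(data, (list, dict)):
        raise ValueError("For 'json', data must be a list or a dictionary.")

    path = os.path.abspath(destination)
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        raise FileNotFoundError(f"Directory does not exist: {directory}")

    if format == "csv":
        # Column headers come from the first dictionary (assumption: an
        # empty list simply produces a file with an empty header row).
        fieldnames = list(data[0].keys()) if data else []
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(data)
    else:
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=4)  # indent width is an assumption

    return path
```

In this sketch, `csv.DictWriter` keeps its default `extrasaction='raise'`, so a later row containing a key that is absent from the first dictionary raises a ValueError. Whether the real function behaves the same way is not stated in this page.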

Examples

from dsci524_group29_webscraping.save_data import save_data

# Save data as a CSV file
save_data([{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}], format='csv', destination='data.csv')

# Save data as a JSON file
save_data({"name": "Alice", "age": 25}, format='json', destination='data.json')

Notes

  • The directory specified in the destination path must exist; otherwise, a FileNotFoundError is raised.

  • For ‘csv’, the first dictionary in the list determines the column headers.
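Both notes can be handled on the caller's side before saving. The following is a minimal sketch, assuming the caller wants missing directories created and wants every row to carry the same keys so that the first dictionary lists all columns. The helper names `ensure_parent_dir` and `align_rows` are hypothetical and are not part of the package.

```python
from pathlib import Path


def ensure_parent_dir(destination):
    """Create the destination's parent directory if missing (hypothetical helper)."""
    path = Path(destination)
    path.parent.mkdir(parents=True, exist_ok=True)
    return str(path)


def align_rows(rows, fill=""):
    """Give every row the same keys, in first-seen order (hypothetical helper)."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    # Keys missing from a row are filled with the placeholder value.
    return [{key: row.get(key, fill) for key in columns} for row in rows]
```

The results would then be passed on, for example as save_data(align_rows(rows), format='csv', destination=ensure_parent_dir('out/data.csv')), where 'out/data.csv' is an illustrative path.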