Overview
The synapseutils package provides higher-level functions and utilities for interacting with Synapse. These include:
copy.copy()
copy.copyWiki()
copy
This function copies entities (Tables, Links, Files, Folders, Projects) and recursively copies the contents of folders and projects.
A mapping of the old entities to the new entities is created. The wikis of each entity are also copied, and links to Synapse IDs are updated.
- param syn
A Synapse object, as obtained with syn = synapseclient.login(). You must be logged in to Synapse.
- param entity
A Synapse entity ID
- param destinationId
Synapse ID of the folder or project that the entity is being copied to
- param skipCopyWikiPage
Skip copying the wiki pages. Default is False.
- param skipCopyAnnotations
Skip copying the annotations. Default is False.
Examples::

    import synapseutils
    import synapseclient
    syn = synapseclient.login()
    synapseutils.copy(syn, …)
Examples and extra parameters unique to each copy function
– COPYING FILES
- param version
The version of the file to copy. Defaults to None.
- param updateExisting
When the destination already has an entity with the same name, users can choose to update that entity. It must be the same entity type. Defaults to False.
- param setProvenance
Sets the provenance of the copied entity. Accepts one of three values:
traceback: sets the provenance to the source entity
existing: sets the provenance to the source entity's original provenance (if it exists)
None: no provenance is set
- Examples::

    synapseutils.copy(syn, "syn12345", "syn45678", updateExisting=False, setProvenance="traceback", version=None)
– COPYING FOLDERS/PROJECTS
- param excludeTypes
Accepts a list of entity types (file, table, link) which determines which entity types to not copy. Defaults to an empty list.
Examples::

    # This will copy everything in the project into the destinationId except files and tables.
    synapseutils.copy(syn, "syn123450", "syn345678", excludeTypes=["file", "table"])
- returns
A mapping between the original and copied entities, for example {'syn1234': 'syn33455'}
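One way to put the returned mapping to use is to translate old Synapse IDs in free text, which is the kind of link update described above. The sketch below is only an illustration and is not part of synapseutils: `remap_ids` is a hypothetical helper, and the dict is a hard-coded stand-in for the value that `synapseutils.copy` returns.

```python
import re


def remap_ids(text, mapping):
    """Replace every Synapse ID in `text` that appears as a key in `mapping`.

    Hypothetical helper for illustration; not a synapseutils function.
    """
    # Match 'syn' followed by digits as a whole token, so that 'syn1234'
    # is not rewritten when it is only the prefix of 'syn12345'.
    pattern = re.compile(r"\bsyn\d+\b")
    return pattern.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


# Stand-in for the dict returned by synapseutils.copy(...)
mapping = {"syn1234": "syn33455"}
wiki_text = "See syn1234 for raw data and syn12345 for the analysis."
updated = remap_ids(wiki_text, mapping)
print(updated)
```

IDs that are absent from the mapping are left untouched, so the same helper can be applied to text that mixes copied and uncopied entities.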
walk
synapseutils.walk.walk(syn, synId)
Traverse through the hierarchy of files and folders stored under the synId. Has the same behavior as os.walk().
- Parameters
syn – A Synapse object, as obtained with syn = synapseclient.login(). You must be logged in to Synapse.
synId – A synapse ID of a folder or project
Example:
    walkedPath = walk(syn, "syn1234")
    for dirpath, dirname, filename in walkedPath:
        print(dirpath)
        print(dirname)  # All the folders in the directory path
        print(filename)  # All the files in the directory path
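Because walk is described as behaving like os.walk(), it may help to see the shape of what os.walk() itself yields. The sketch below runs against a temporary local directory using only the standard library; the folder and file names are made up for the illustration, and the exact element types that synapseutils.walk yields are not shown here.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny local hierarchy: one file at the top, one in a subfolder.
    os.makedirs(os.path.join(root, "folder_a"))
    open(os.path.join(root, "top.txt"), "w").close()
    open(os.path.join(root, "folder_a", "nested.txt"), "w").close()

    walked = []
    # Each iteration yields (current directory, its subdirectories, its files).
    for dirpath, dirnames, filenames in os.walk(root):
        relative = os.path.relpath(dirpath, root)
        walked.append((relative, sorted(dirnames), sorted(filenames)))

for entry in walked:
    print(entry)
```

The parent directory is visited before its children, so the top-level tuple comes first and each subfolder follows with its own lists.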
sync
synapseutils.sync.generateManifest(syn, allFiles, filename, provenance_cache=None)
Generates a manifest file based on a list of entity objects.
- Parameters
allFiles – A list of File Entities
filename – file where manifest will be written
provenance_cache – an optional dict of known provenance dicts keyed by entity ids
synapseutils.sync.readManifestFile(syn, manifestFile)
Verifies a file manifest and returns a reordered dataframe ready for upload.
- Parameters
syn – A synapse object as obtained with syn = synapseclient.login()
manifestFile – A tsv file with file locations and metadata to be pushed to Synapse. See below for details
- Returns
A pandas DataFrame if the manifest is validated.
See also the "Manifest file format" section under syncToSynapse for a description of the file format.
synapseutils.sync.syncFromSynapse(syn, entity, path=None, ifcollision='overwrite.local', allFiles=None, followLink=False)
Synchronizes all the files in a folder (including subfolders) from Synapse and adds a readme manifest with file metadata.
- Parameters
syn – A synapse object as obtained with syn = synapseclient.login()
entity – A Synapse ID, a Synapse Entity object of type file, folder or project.
path – An optional path where the file hierarchy will be reproduced. If not specified the files will by default be placed in the synapseCache.
ifcollision – Determines how to handle file collisions. May be “overwrite.local”, “keep.local”, or “keep.both”. Defaults to “overwrite.local”.
followLink – Determines whether the link returns the target Entity. Defaults to False
- Returns
list of entities (files, tables, links)
This function will crawl all subfolders of the project/folder specified by entity and download all files that have not already been downloaded. If there are newer files in Synapse (or a local file has been edited outside of the cache) since the last download, then the local file will be replaced by the new file unless “ifcollision” is changed.
If the files are being downloaded to a specific location outside of the Synapse cache, a file (SYNAPSE_METADATA_MANIFEST.tsv) will also be added in the path. It contains the metadata (annotations, storage location and provenance) of all downloaded files.
See also: synapseutils.sync.syncToSynapse()
Example: Download and print the paths of all downloaded files:
    entities = syncFromSynapse(syn, "syn1234")
    for f in entities:
        print(f.path)
synapseutils.sync.syncToSynapse(syn, manifestFile, dryRun=False, sendMessages=True, retries=4)
Synchronizes files specified in the manifest file to Synapse.
- Parameters
syn – A synapse object as obtained with syn = synapseclient.login()
manifestFile – A tsv file with file locations and metadata to be pushed to Synapse. See below for details
dryRun – Performs validation without uploading if set to True (default is False)
Given a file describing all of the uploads, this function uploads the content to Synapse and optionally notifies you via Synapse messaging (email) at specific intervals, on errors and on completion.
Manifest file format
The format of the manifest file is a tab delimited file with one row per file to upload and columns describing the file. The minimum required columns are path and parent where path is the local file path and parent is the Synapse Id of the project or folder where the file is uploaded to. In addition to these columns you can specify any of the parameters to the File constructor (name, synapseStore, contentType) as well as parameters to the syn.store command (used, executed, activityName, activityDescription, forceVersion). Used and executed can be semi-colon (“;”) separated lists of Synapse ids, urls and/or local filepaths of files already stored in Synapse (or being stored in Synapse by the manifest). Any additional columns will be added as annotations.
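Required fields:

======  ======================  =======================
Field   Meaning                 Example
======  ======================  =======================
path    local file path or URL  /path/to/local/file.txt
parent  Synapse ID              syn1235
======  ======================  =======================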
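Common fields:

============  =========================  ============
Field         Meaning                    Example
============  =========================  ============
name          name of file in Synapse    Example_file
forceVersion  whether to update version  False
============  =========================  ============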
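Provenance fields:

===================  ===================================  ===========================================
Field                Meaning                              Example
===================  ===================================  ===========================================
used                 List of items used to generate file  syn1235; /path/to_local/file.txt
executed             List of items executed               https://github.org/; /path/to_local/code.py
activityName         Name of activity in provenance       "Ran normalization"
activityDescription  Text description on what was done    "Ran algorithm xyx with parameters…"
===================  ===================================  ===========================================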
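The used and executed cells hold several items in one string, separated by semicolons. As a rough sketch of how such a cell could be split into individual items, the hypothetical helper below strips whitespace and drops empty entries. How synapseutils parses these cells internally is not shown here, so treat this purely as an illustration of the format.

```python
def split_provenance(cell):
    """Split a semicolon-separated provenance cell into a list of items.

    Hypothetical helper for illustration; not a synapseutils function.
    """
    # Strip surrounding whitespace and skip empty pieces, so an empty
    # cell yields an empty list rather than [''].
    return [item.strip() for item in cell.split(";") if item.strip()]


used = split_provenance("syn1235; /path/to_local/file.txt")
empty = split_provenance("")
print(used)
print(empty)
```

A plain split like this assumes that no item contains a semicolon itself, which may not hold for every URL.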
Annotations:
Any columns that are not in the reserved names described above will be interpreted as annotations of the file
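Other optional fields:

============  ==========================================  =========
Field         Meaning                                     Example
============  ==========================================  =========
synapseStore  Boolean describing whether to upload files  True
contentType   content type of file to overload defaults   text/html
============  ==========================================  =========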
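The column rules above can be summarized as: path and parent are required, a fixed set of names is reserved, and everything else becomes an annotation. The sketch below applies that rule to a header row. The reserved names are taken from the tables above; `classify_columns` is a hypothetical helper and not the validation that readManifestFile actually performs.

```python
# Column names taken from the field tables in this section.
REQUIRED = {"path", "parent"}
RESERVED = REQUIRED | {
    "name", "synapseStore", "contentType",
    "used", "executed", "activityName", "activityDescription", "forceVersion",
}


def classify_columns(header):
    """Return (missing required columns, columns treated as annotations).

    Hypothetical helper for illustration; not a synapseutils function.
    """
    missing = sorted(REQUIRED - set(header))
    annotations = [col for col in header if col not in RESERVED]
    return missing, annotations


header = ["path", "parent", "annot1", "annot2", "used", "executed"]
missing, annotations = classify_columns(header)
print(missing)
print(annotations)
```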
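Example manifest file

===============  ========  ======  ======  =========================  ========
path             parent    annot1  annot2  used                       executed
===============  ========  ======  ======  =========================  ========
/path/file1.txt  syn1243   "bar"   3.1415  "syn124; /path/file2.txt"
/path/file2.txt  syn12433  "baz"   2.71    ""
===============  ========  ======  ======  =========================  ========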
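A manifest like the one above can be produced with the standard csv module. The sketch below writes the same two rows as tab-delimited text into an in-memory buffer. The executed column is left out because its values are not given in the example above, and the variable names are assumptions made for this illustration.

```python
import csv
import io

# Rows mirror the example manifest; values are kept as plain strings.
rows = [
    {"path": "/path/file1.txt", "parent": "syn1243", "annot1": "bar",
     "annot2": "3.1415", "used": "syn124; /path/file2.txt"},
    {"path": "/path/file2.txt", "parent": "syn12433", "annot1": "baz",
     "annot2": "2.71", "used": ""},
]
fieldnames = ["path", "parent", "annot1", "annot2", "used"]

# Write to an in-memory buffer; a real manifest would go to a .tsv file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, delimiter="\t",
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
manifest_text = buffer.getvalue()
print(manifest_text)
```

Saving this text to a file and passing its path to syncToSynapse with dryRun=True should, per the parameter description above, validate the manifest without uploading anything.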
monitor
synapseutils.monitor.notifyMe(syn, messageSubject='', retries=0)
Function decorator that notifies you via email whenever a function completes running or there is a failure.
- Parameters
syn – A synapse object as obtained with syn = synapseclient.login()
messageSubject – A string with subject line for sent out messages.
retries – Number of retries to attempt on failure (default=0)
Example:
    # to decorate a function that you define
    from synapseutils import notifyMe
    import synapseclient
    syn = synapseclient.login()

    @notifyMe(syn, 'Long running function', retries=2)
    def my_function(x):
        doing_something()
        return long_runtime_func(x)

    my_function(123)

    #############################
    # to wrap a function that already exists
    from synapseutils import notifyMe
    import synapseclient
    syn = synapseclient.login()

    notify_decorator = notifyMe(syn, 'Long running query', retries=2)
    my_query = notify_decorator(syn.tableQuery)
    results = my_query("select id from syn1223")

    #############################
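To make the retry-and-notify pattern concrete without a Synapse login, the sketch below shows one way such a decorator could be structured. It is not the notifyMe implementation: `notify_on_finish` and its `send` callback are hypothetical, the callback stands in for the Synapse message call, and the real decorator's message timing and content may differ.

```python
import functools


def notify_on_finish(send, subject="", retries=0):
    """Hypothetical stand-in for a notifyMe-style decorator.

    `send(subject, body)` replaces the Synapse message call.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    result = func(*args, **kwargs)
                except Exception as exc:
                    # Report the failure, then retry until retries run out.
                    send(subject, f"attempt {attempt + 1} failed: {exc}")
                    if attempt >= retries:
                        raise
                    attempt += 1
                else:
                    send(subject, "completed")
                    return result
        return wrapper
    return decorator


messages = []
calls = {"n": 0}


@notify_on_finish(lambda subject, body: messages.append((subject, body)),
                  "demo", retries=2)
def flaky():
    # Fails on the first call only, to exercise one retry.
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient")
    return "ok"


outcome = flaky()
print(outcome, messages)
```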
synapseutils.monitor.with_progress_bar(func, totalCalls, prefix='', postfix='', isBytes=False)
Wraps a function to add a progress bar based on the number of calls to that function.
- Parameters
func – Function being wrapped with a progress bar
totalCalls – total number of items/bytes when completed
prefix – String printed before progress bar
postfix – String printed after progress bar
isBytes – A boolean indicating whether to convert bytes to kB, MB, GB etc.
- Returns
a wrapped function that contains a progress bar
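The idea of driving progress from the number of calls can be sketched with a small wrapper that counts invocations. This is not how with_progress_bar is implemented: `count_calls_with_progress` and its `report` parameter are hypothetical, and the text output is a simplification of a real progress bar.

```python
import functools


def count_calls_with_progress(func, total_calls, prefix="", postfix="",
                              report=print):
    """Wrap `func` so each completed call reports 'done/total'.

    Hypothetical helper for illustration; not a synapseutils function.
    """
    done = 0

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal done
        result = func(*args, **kwargs)
        done += 1  # count only calls that finished without raising
        report(f"{prefix} {done}/{total_calls} {postfix}".strip())
        return result

    return wrapper


lines = []
square = count_calls_with_progress(lambda x: x * x, 3, prefix="squaring",
                                   report=lines.append)
results = [square(n) for n in range(3)]
print(results)
print(lines)
```

Passing `report=lines.append` collects the progress text in a list here; leaving the default prints each update instead.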