This past month I've been quite busy with the start of the International Master's in Artificial Intelligence at the University of San Diego, California, and I thought it would be nice to share a bit about what this new stage has been like.
I'm at an interesting moment in my career, able to focus completely on studying and learning, and especially to take advantage of not having meetings all day to dig deeper into the topics I'm working on now.
Why pursue a master's degree? I want an intellectual framework that prepares me for the next cycle of growth in AI technologies. With the advancement of LLM platforms, and the apparent disillusionment and weak financial results around them, the next few years will be interesting, and I want to be more technically prepared for that.
I chose the University of San Diego for a few reasons:
- Strong focus on integration with industry
- Faculty who are actively present in the course, and proximity to the innovation ecosystem
- Pricing, and the apparent availability of international payment options
The course schedule is extensive, but the curriculum is relatively compact and focused.
Unlike a more traditional master's degree, one thing about this course really caught my attention: there is no final dissertation, but rather a Capstone project accompanied by a paper. Whether that's better, I don't know; ask me again in 2026.
The fact that it is online is not ideal for me, but I have really enjoyed it, and there is the possibility of visiting and spending time with the faculty in the US, which I intend to do. Studying in English is not a challenge, but it's always good to have the pressure to keep improving in a language I've been using for so many years.
My group has about 50 people, mostly North Americans with experience in technology, of all ages and backgrounds. The group is quite diverse overall, which makes me happy and excited to see how the course will develop.
Each discipline has a final group project, which becomes a great opportunity for connecting with other people. This final project has to be presented as a video, with a business tone and an academic slant, an interesting combination that I haven't seen in any program in Brazil.
What really caught my attention was how the course doesn't hold your hand at all, for better or for worse. Basically, there's an introductory video for the module, a foundational book, and some reference links, and at the end: get on with it, bro! There's an assignment due at the end of the week, and another module starts the next. Literally a "get on with it!"
On that note, all my gratitude goes to the Indian creators on YouTube, including those in the Indian diaspora in the US, who have been helping data science students around the world for more than a decade. I salute you! 🍹
My study and development environment
Obsidian FTW
I've been talking about Obsidian for a while now, and this is no different. Everything I do is somehow related to my use of Obsidian as my main note-taking and study tool.
In addition to being my traditional note-taking and knowledge management tool, it has become my research tool through its integration with Zotero, a reference manager that now seems essential for what I'm doing.
So, I can describe my workflow as:
- Zotero: stores and manages reading references, lets me annotate PDFs, and organizes the studies, papers, and scientific articles I read by topic
- Obsidian: all study notes, writing and reviewing assignments, and knowledge management, storing references once they are read and processed
Templates help me automatically create note links and integrate with Zotero. It has worked well, but getting everything integrated at the start was really intense.
My content map so far integrates what I have already studied and written about AI with the material from the first course: Probability and Statistics (revisiting the topic 20 years after graduating!)
There is a lot of content in English about this setup; here is a recommendation.
Right now, I integrate my Kindle readings, the articles I read for my master's degree through Zotero, day-to-day project management, and daily notes from meetings and life, all connected in the decentralized, distributed way that is the incredible power of a wiki. And since the files are stored on my computer, I am free to manage them any way I see fit.
I manage files using the PARA method, with templates, and I back them up to two different services: iCloud and GitHub.
Additionally, I can view the graph of relationships between subjects at variable granularity, which lets me connect ideas and understand how topics intertwine in everyday life.
I feel like I'm in an IDE, programming my life. Looking back at how I did it before, it seems so amateurish that I never want to go back.
Dev Environment
The entire course is conducted in Python, and I'm really enjoying learning more about the language I've used in the past.
A lot has changed since my talk on Jabber bots and XMPP at Python Ireland 2009! It's great to be back on track!
The course development environment is quite open, but one tool really caught my attention and became my main way of sharing assignments: Jupyter Notebook.
It seems to be common in the data science community, but this open-source, web-based tool is amazing for creating and sharing documents that combine text and code. Its document-oriented approach makes building and running a data study much simpler and more straightforward.
The tool lets me share a document with anyone, who in turn can run exactly the same code as me while evaluating the analysis in the document.
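To make this concrete, here is a minimal sketch of what a notebook cell can look like; the `houses.csv` file and the `price` column are hypothetical placeholders, not the actual course material:

```python
# A typical notebook cell: load a dataset and show a quick summary.
# "houses.csv" and the "price" column are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("houses.csv")

# In a notebook, the last expression of a cell renders inline,
# so this summary table appears right next to the explanatory text.
df["price"].describe()
```

Anyone who receives the notebook sees the narrative, the code, and the rendered output together, and can re-run any cell to verify the result.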
Additionally, I have been using Conda to manage Python environments on macOS, which is a great help in keeping what I'm studying organized without polluting my machine's global environment.
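A small related trick: from inside a notebook, a couple of lines of Python are enough to confirm which environment the kernel is actually running in:

```python
# Check which Python the notebook kernel is using; with Conda, the paths
# should point inside the active environment rather than the system Python.
import sys

print(sys.executable)
print(sys.prefix)
```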
There are currently several options for working with notebooks, with JupyterLab, Google Colab, and VS Code being the main ones.
I'm also using Neovim with NvChad to program, and it was a lot of work to organize the whole environment to run notebooks through the Molten plugin... mostly the fault of ImageMagick, which to this day still pulls a few hairs out of this white head...
It was an interesting challenge to configure Neovim to work with Python, with LSP, autocompletion, and formatting all working. Times have really changed, and Vim is an incredible tool!
If you want to check out the configuration I put together, here it is: Neovim Config on Github
The course has been really cool, focusing completely on the technical side, and I've been sharing everything I'm doing publicly on my Github, in case you're interested in learning more.
To give a sense of what I'm actually doing, I want to share an example of one of the questions I answered last week. It was part of the Statistical Inference module, and it points toward the discipline's final project: applying what we learned to a real problem and making a "business" presentation.
This past week, at the end of the weekly module, I had to solve several inferential statistics questions, and I want to leave one of them here for reference.
The content below was taken from the notebook I keep on GitHub:
An Example Assignment
Question
The Houses data file at the book's website lists, for 100 home sales in Gainesville, Florida, several variables, including the selling price in thousands of dollars and whether the house is new (1 = yes, 0 = no). Prepare a short report in which, stating all assumptions including the relative importance of each, you conduct descriptive and inferential statistical analyses to compare the selling prices for new and older homes.
Data Report
This report aims to compare the selling prices of new and old homes using the Gainesville, Florida housing dataset. The analysis will focus on whether there is a statistically significant difference in selling prices based on the "new" status of a house.
Data Overview
The dataset includes various attributes related to the houses; we will focus on the price and on whether the house is new, to understand how price and "new" status interact.
Looking at the first chart, with all houses divided between new and old, we see an evident disparity in the mean and median values, with new houses averaging more than double the price of old ones. Maximum prices for new and old houses are similar (new: `$`866k, old: `$`800k), but the box plot makes clear that the old-house maximum is an outlier. The vast majority (75%) of old houses sell for less than `$`240k.
One interesting thing the box plot shows is that, although an old house can reach a maximum price similar to the upper quartile range of new houses, those cases are outliers and will affect the analysis of this sample. In general terms, old houses sell at far lower prices than new houses.
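The notebook on GitHub has the actual charts; as a rough sketch, a box plot like the one described here could be produced along these lines (the file and column names are hypothetical stand-ins, with prices in thousands of dollars):

```python
# Box plot of selling price for new vs. old houses.
# "houses.csv", "price", and "new" are hypothetical stand-ins for the dataset.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("houses.csv")
df["status"] = df["new"].map({1: "New", 0: "Old"})

df.boxplot(column="price", by="status")
plt.suptitle("")  # drop the automatic "Boxplot grouped by status" super-title
plt.title("Selling price by house status")
plt.ylabel("Price (thousands of dollars)")
plt.show()
```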
Descriptive Statistics
New Homes:
- Mean Selling Price: `$`436k
- Standard Deviation: `$`219k
Old Homes:
- Mean Selling Price: `$`208k
- Standard Deviation: `$`121k
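Summaries like these can be reproduced with a short pandas snippet; a minimal sketch, assuming the same hypothetical file and column names as above:

```python
# Per-group descriptive statistics: mean and standard deviation of price.
import pandas as pd

df = pd.read_csv("houses.csv")
status = df["new"].map({1: "New homes", 0: "Old homes"})

print(df.groupby(status)["price"].agg(["mean", "std"]).round(1))
```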
Hypothesis Testing
- Null Hypothesis: There is no difference in mean selling prices between new and old homes.
- Alternative Hypothesis: There is a difference in mean selling prices between new and old homes.
- T-statistic: 5.3183
- P-value: 0.000001
The T-statistic of 5.3183 indicates that the difference between the two group means is quite large relative to the variability in the data. This high value is strong evidence against the null hypothesis, which states that there is no difference between the group means.
The P-value of 0.000001 provides overwhelming evidence against the null hypothesis, suggesting it is highly unlikely that the observed difference is due to random chance.
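The exact code is in the notebook on GitHub; a minimal sketch of such a two-sample t-test with SciPy, under the same hypothetical column names, would look something like this:

```python
# Two-sample t-test comparing mean selling prices of new vs. old houses.
# Using Welch's variant (equal_var=False) is an assumption on my part: it is
# the safer choice when the groups have clearly different variances, as here.
import pandas as pd
from scipy import stats

df = pd.read_csv("houses.csv")
new_prices = df.loc[df["new"] == 1, "price"]
old_prices = df.loc[df["new"] == 0, "price"]

t_stat, p_value = stats.ttest_ind(new_prices, old_prices, equal_var=False)
print(f"T-statistic: {t_stat:.4f}, P-value: {p_value:.6f}")
```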
Conclusion
We can conclude that there is a statistically significant difference between the selling prices for the two groups being compared.