[Sparta Coding Club] Course 2. Python Comprehensive Class - Week 2 (1) Dev Log

느리게가는시계 2023. 1. 16. 14:46

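[Table of Contents]

#.2-1 What We'll Learn Today 03:36
#.2-2 Pandas Basics 07:50
#.2-3 Importing Excel 02:23
#.2-4 Pandas in Practice 05:24
#.2-5 Working with Overseas Stocks - yfinance 08:59
#.2-6 Analysis (1): Setting a Strategy 16:03
#.2-7 Analysis (2): Running the Analysis 07:06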

01. What We'll Learn Today

Getting familiar with the DataFrame in Pandas, the data analyst's friend

Getting started on overseas stock analysis with yfinance
- Open the Yahoo Finance site
- https://finance.yahoo.com/quote/AAPL?p=AAPL&.tsrc=fin-srch
- Start up Colab

02. Pandas Basics

Creating a basic DataFrame
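
The notes do not record the frame that was built in this step. A minimal sketch, assuming a made-up table of four people with `name` and `age` columns (the values are hypothetical, chosen only so the later snippets have something to work on):

```python
import pandas as pd

# Hypothetical sample data; the notes do not record the original values.
data = {
    'name': ['bob', 'carol', 'dave', 'erin'],
    'age': [26, 31, 19, 45],
}

# Each dictionary key becomes a column; each list holds that column's values.
df = pd.DataFrame(data)
print(df)
```

Four rows are used so that, after one more row is appended, the five-item `city` list in the next step lines up.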

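Working with DataFrames - basics
- Adding a row
> doc = { 'name':'세종', 'age':14, }
> # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
> df = pd.concat([df, pd.DataFrame([doc])], ignore_index=True)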

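- Adding a column
> df['city'] = ['서울','부산','부산','서울','서울']

- Selecting specific columns
> df[['name','city']]

- Selecting only the rows that match a condition
> df[df['age'] < 20]

- Picking a value out of a specific row
> df.iloc[-1,0] # last row, first column
> df.iloc[0,0] # first row, first column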
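
`iloc` addresses cells purely by position, so `df.iloc[-1,0]` depends on which column happens to come first. A small sketch with hypothetical data that reaches the same cell by position and by column name:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'name': ['bob', 'carol', 'dave'], 'age': [26, 31, 19]})

last_by_position = df.iloc[-1, 0]    # last row, first column
last_by_name = df['name'].iloc[-1]   # last row of the 'name' column
first_by_position = df.iloc[0, 0]    # first row, first column

print(last_by_position, last_by_name, first_by_position)
```

Selecting the column by name first keeps the lookup stable if the column order changes later.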

Working with DataFrames - operations
- Sorting rows by a column
> df.sort_values(by='age',ascending=True)

- Adding a column based on a condition
> import numpy as np
> np.where(df['age'] > 20,'성인','청소년') # '성인' = adult, '청소년' = minor
> df['is_adult'] = np.where(df['age'] > 20,'성인','청소년')

- Getting the mean, maximum, minimum, and count
> df['age'].mean()
> df['age'].max()
> df['age'].min()
> df['age'].count()

- Quiz - Among the people living in Seoul ('서울'), how old is the oldest?
> df[df['city'] == '서울']['age'].max()
> # or
> df[df['city'] == '서울'].sort_values(by='age',ascending=False).iloc[0,1]
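
The operations above can be tried end to end on a small frame. This is a sketch with invented rows (only the `세종` row and the `city` values come from the notes); it also shows `agg`, which collects several statistics in one call, and `idxmax`, which finds the whole row behind the quiz answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    'name': ['bob', 'carol', 'dave', 'erin', '세종'],
    'age': [26, 31, 19, 45, 14],
    'city': ['서울', '부산', '부산', '서울', '서울'],
})

# np.where picks '성인' (adult) where the condition holds and '청소년' (minor) elsewhere.
df['is_adult'] = np.where(df['age'] > 20, '성인', '청소년')

# Several aggregates in one call.
stats = df['age'].agg(['mean', 'max', 'min', 'count'])

# Quiz: filter to Seoul, then take the maximum age.
seoul = df[df['city'] == '서울']
oldest_in_seoul = seoul['age'].max()

# idxmax returns the row label of the maximum, which gives access to the whole row.
oldest_row = df.loc[seoul['age'].idxmax()]

print(stats)
print(oldest_in_seoul, oldest_row['name'])
```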

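03. Importing Excel

Importing real data
- Drag the Excel file and drop it into Colab!

Reading the Excel file into a DataFrame
> df = pd.read_excel('종목데이터.xlsx')
> df.head()
> df.tail()
> pd.options.display.float_format = '{:.2f}'.format # show floats to two decimal places
> df = pd.read_excel('종목데이터.xlsx')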

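04. Pandas in Practice

Picking only the stocks that rose yesterday
> df[df['change_rate'] > 0]
Removing stocks whose PER is 0 (keeping only PER > 0)
> df = df[df['per'] > 0]
Adding net income and closing price
> # PER = market cap / net income = share price / earnings per share (EPS)
> df['earning'] = df['marketcap'] / df['per']
> df['close'] = df['per'] * df['eps']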
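
Both new columns come from rearranging the PER identity in the comment above. A worked sketch with two made-up stocks (all numbers are hypothetical) makes the arithmetic concrete:

```python
import pandas as pd

# Hypothetical figures for two made-up stocks.
df = pd.DataFrame({
    'name': ['A', 'B'],
    'marketcap': [1_000_000, 600_000],
    'per': [10.0, 20.0],
    'eps': [500.0, 150.0],
})

# PER = market cap / net income  ->  net income = market cap / PER
df['earning'] = df['marketcap'] / df['per']
# PER = share price / EPS        ->  share price = PER * EPS
df['close'] = df['per'] * df['eps']

print(df[['name', 'earning', 'close']])
```

For stock A this gives a net income of 1,000,000 / 10 = 100,000 and a price of 10 * 500 = 5,000.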
Dropping the date column
> df = df.drop(columns=['date'])
Filtering on a combined condition (PBR < 1, market cap over 1 trillion, PER < 20)
> cond = (df['marketcap'] > 1000000000000) & (df['pbr'] < 1) & (df['per'] < 20)
> df[cond]
> # View in descending order of market cap
> df[cond].sort_values(by='marketcap', ascending=False)
> # View summary statistics such as the mean and standard deviation (follow along)
> df[cond].describe()
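
Each comparison in `cond` is wrapped in parentheses because `&` binds more tightly than `>` and `<` in Python; without them the expression would be grouped incorrectly. A sketch of the same filter on five invented rows (column names follow the notes, values are hypothetical):

```python
import pandas as pd

# Hypothetical rows; only A and E satisfy all three conditions.
df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D', 'E'],
    'marketcap': [5_000_000_000_000, 2_000_000_000_000, 300_000_000_000,
                  8_000_000_000_000, 9_000_000_000_000],
    'pbr': [0.8, 1.5, 0.5, 0.9, 0.7],
    'per': [12.0, 8.0, 6.0, 25.0, 15.0],
})

# Combine the three boolean Series element-wise with &.
cond = (df['marketcap'] > 1_000_000_000_000) & (df['pbr'] < 1) & (df['per'] < 20)

# Keep matching rows, largest market cap first.
picked = df[cond].sort_values(by='marketcap', ascending=False)
print(picked['name'].tolist())
```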

05. Working with Overseas Stocks - yfinance

Installing the yfinance library
> pip install yfinance

Trying out yfinance
> import yfinance as yf
> company = yf.Ticker('TSLA')
> company.info

Getting basic information
> # Company name, industry, market cap, revenue
> name = company.info['shortName']
> industry = company.info['industry']
> marketcap = company.info['marketCap']
> revenue = company.info['totalRevenue']
> print(name,industry,marketcap,revenue)
Getting three years of data from the financial statements
> # Balance sheet, cash flow statement, company earnings
> company.balance_sheet
> company.cashflow
> company.earnings
Other information
> # Shareholder information, analyst recommendations, etc.
> company.institutional_holders
> company.recommendations
> company.calendar
>
> news = company.news
> for n in news:
>   print(n['title'])
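
`company.info` is a Python dictionary, and a given ticker may not carry every key, in which case a direct lookup such as `company.info['industry']` raises `KeyError`. A minimal sketch of a more defensive lookup follows. It uses a plain dictionary as a stand-in for `company.info` so that it runs without network access; `pick_fields` is a hypothetical helper, not part of yfinance, and the values are invented.

```python
def pick_fields(info, keys):
    """Return the requested keys from an info-style dict, using None for missing ones."""
    return {key: info.get(key) for key in keys}


# Stand-in for company.info; the values are invented for illustration.
fake_info = {
    'shortName': 'Example Corp',
    'industry': 'Auto Manufacturers',
    'marketCap': 1_000_000,
}

# 'totalRevenue' is absent from fake_info, so it comes back as None.
row = pick_fields(fake_info, ['shortName', 'industry', 'marketCap', 'totalRevenue'])
print(row)
```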

06. Analysis (1): Setting a Strategy

[Setting a strategy]: Getting started: "1) set a strategy → 2) collect the data → 3) analyze the collected data"
Let's gather the information we want to see for each stock!
Each item can be obtained as follows:
. Ticker code ⇒ code
. Company name ⇒ company.info['shortName']
. Industry ⇒ company.info['industry']
. Description ⇒ company.info['longBusinessSummary']
. Market cap ⇒ company.info['marketCap']
. Current share price ⇒ company.info['currentPrice']
. Expected share price in one year ⇒ company.info['targetMeanPrice']
. PER ⇒ company.info['trailingPE']
. EPS ⇒ company.info['trailingEps']
. PBR ⇒ company.info['priceToBook']
. Revenue (3 years) ⇒ company.earnings ⇒ take the last three years
. Net income (3 years) ⇒ company.earnings ⇒ take the last three years
. News ⇒ company.news ⇒ take the most recent news item
In code, that looks like this:
> company = yf.Ticker('TSLA')
> code = 'TSLA'
> name = company.info['shortName']
> industry = company.info['industry']
> marketcap = company.info['marketCap']
> summary = company.info['longBusinessSummary']
> currentprice = company.info['currentPrice']
> targetprice = company.info['targetMeanPrice']
> per = company.info['trailingPE']
> eps = company.info['trailingEps']
> pbr = company.info['priceToBook']
> print(code,name,industry,marketcap,summary,currentprice,targetprice,per,eps,pbr)
Now let's add revenue and net income.
> # Add revenue and net income for the last three years
> rev2021 = company.earnings.iloc[-1,0]
> rev2020 = company.earnings.iloc[-2,0]
> rev2019 = company.earnings.iloc[-3,0]
> ear2021 = company.earnings.iloc[-1,1]
> ear2020 = company.earnings.iloc[-2,1]
> ear2019 = company.earnings.iloc[-3,1]
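
The `iloc` calls above treat column 0 as revenue and column 1 as net income, with negative row positions counting back from the most recent year. The sketch below reproduces that on a stand-in frame. The layout (a `Year` index with `Revenue` and `Earnings` columns) is an assumption about what `company.earnings` looked like, and the numbers are invented.

```python
import pandas as pd

# Stand-in for company.earnings; the layout and numbers are assumptions.
earnings = pd.DataFrame(
    {'Revenue': [100, 120, 150, 200], 'Earnings': [5, 8, 12, 20]},
    index=pd.Index([2018, 2019, 2020, 2021], name='Year'),
)

# Negative row positions count from the end: -1 is the latest year.
rev2021, rev2020, rev2019 = (earnings.iloc[i, 0] for i in (-1, -2, -3))
ear2021, ear2020, ear2019 = (earnings.iloc[i, 1] for i in (-1, -2, -3))

# The same three years selected as rows, which keeps the year labels visible.
last3 = earnings.tail(3)

print(rev2021, rev2020, rev2019, ear2021, ear2020, ear2019)
print(last3)
```

Hard-coding the year into names like `rev2021` only stays accurate while the last row really is 2021; checking `last3.index` is one way to confirm that.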
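1) Creating an empty DataFrame
> import pandas as pd
> pd.options.display.float_format = '{:.2f}'.format
> # Create an empty DataFrame to collect the data
> df = pd.DataFrame()
2) Reminder - how to put data into a DataFrame
> # Create a dictionary and append it as a new row
> doc = { 'name':'bob', 'age':26, }
> # DataFrame.append was removed in pandas 2.0; pd.concat is used instead
> pd.concat([df, pd.DataFrame([doc])], ignore_index=True)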
3) Collecting the data (extracting the TSLA information)
> doc = {
    'code':code,
    'name':name,
    'industry':industry,
    'business':summary,
    'marketCap':marketcap/1000,
    'currentPrice':currentprice,
    'targetPrice':targetprice,
    'per':per,
    'eps':eps,
    'pbr':pbr,
    'news':news[0]['title'],  # title of the most recent news item
    'rev2021':rev2021/1000,
    'rev2020':rev2020/1000,
    'rev2019':rev2019/1000,
    'ear2021':ear2021/1000,
    'ear2020':ear2020/1000,
    'ear2019':ear2019/1000,
}
> # DataFrame.append was removed in pandas 2.0; pd.concat is used instead
> pd.concat([df, pd.DataFrame([doc])], ignore_index=True)
4) Collecting data for every stock!
> # First step toward collecting every stock: wrap the lookups in a function
> def add_company(code):
  company = yf.Ticker(code)
  # Basic company information
  name = company.info['shortName']
  industry = company.info['industry']
  business = company.info['longBusinessSummary']
  marketCap = company.info['marketCap']
  currentPrice = company.info['currentPrice']
  targetPrice = company.info['targetMeanPrice']
  # Valuation ratios (this function uses the forward PER and EPS)
  per = company.info['forwardPE']
  eps = company.info['forwardEps']
  pbr = company.info['priceToBook']
  # Revenue (column 0) and net income (column 1) for the last three years
  rev2021 = company.earnings.iloc[-1,0]
  rev2020 = company.earnings.iloc[-2,0]
  rev2019 = company.earnings.iloc[-3,0]
  ear2021 = company.earnings.iloc[-1,1]
  ear2020 = company.earnings.iloc[-2,1]
  ear2019 = company.earnings.iloc[-3,1]

  doc = {
    'code':code,
    'name':name,
    'industry':industry,
    'business':business,
    'marketCap':marketCap/1000,
    'currentPrice':currentPrice,
    'targetPrice':targetPrice,
    'per':per,
    'eps':eps,
    'pbr':pbr,
    'rev2021':rev2021/1000,
    'rev2020':rev2020/1000,
    'rev2019':rev2019/1000,
    'ear2021':ear2021/1000,
    'ear2020':ear2020/1000,
    'ear2019':ear2019/1000,
  }
  return doc
5) Collecting all the data (10 companies)
> codes = ['AAPL','ABNB','BIDU','FB','GOOG','MSFT','TSLA','PYPL','NFLX','NVDA']
> df = pd.DataFrame()
> for code in codes:
  print(code)
  try:
    row = add_company(code)
    # DataFrame.append was removed in pandas 2.0; pd.concat is used instead
    df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
  except Exception:
    print(f'error - {code}')
> df
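
Growing a DataFrame one row at a time copies the data on every iteration. An alternative sketch, not the course's method, collects the dictionaries in a list and builds the frame once at the end. `fake_add_company` is a hypothetical stand-in for `add_company` that returns invented numbers and fails for one code, so the example runs without network access.

```python
import pandas as pd


def fake_add_company(code):
    """Hypothetical stand-in for add_company; returns invented numbers."""
    if code == 'BAD':
        # Simulate a ticker whose info lacks an expected field.
        raise KeyError('currentPrice')
    return {'code': code, 'currentPrice': 100.0, 'targetPrice': 120.0}


rows, failed = [], []
for code in ['AAA', 'BAD', 'CCC']:
    try:
        rows.append(fake_add_company(code))
    except Exception as exc:
        # Record the failure and carry on with the remaining tickers.
        failed.append(code)
        print(f'error - {code}: {exc!r}')

# Build the DataFrame once from the collected dictionaries.
df = pd.DataFrame(rows)
print(df)
```

Keeping the failed codes in a list makes it easy to see afterwards which tickers were skipped.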

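07. Analysis (2): Running the Analysis

1) Sorting by EPS
> df.sort_values(by='eps',ascending=False)
2) Viewing only the stocks below a given PER
> df[df['per'] < 30].sort_values(by='per',ascending=False)
3) Picking out stocks with a large relative gap between the current price and the expected price in one year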
> df[['code','name','currentPrice','targetPrice']]
> new_df = df[['code','name','currentPrice','targetPrice']].copy()
> new_df['gap'] = new_df['targetPrice'] / new_df['currentPrice'] -1
> new_df.sort_values(by='gap',ascending=False)
4) Flagging companies whose net income rose every year across the three years
> import numpy as np
> new_df2 = df[['code','name','ear2021','ear2020','ear2019']].copy()
> cond = (new_df2['ear2021'] > new_df2['ear2020']) & (new_df2['ear2020'] > new_df2['ear2019'])
> new_df2['is_target'] = np.where(cond,'O','X')
> new_df2[new_df2['is_target'] == 'O']
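
Steps 3) and 4) can be checked by hand on a small frame. This sketch uses three made-up tickers with invented figures; the column names follow the notes.

```python
import numpy as np
import pandas as pd

# Invented figures for three made-up tickers.
df = pd.DataFrame({
    'code': ['AAA', 'BBB', 'CCC'],
    'currentPrice': [100.0, 50.0, 200.0],
    'targetPrice': [150.0, 45.0, 220.0],
    'ear2019': [10, 30, 5],
    'ear2020': [12, 25, 7],
    'ear2021': [15, 28, 6],
})

# gap = targetPrice / currentPrice - 1, e.g. 150 / 100 - 1 = 0.5 (a 50% expected rise).
df['gap'] = df['targetPrice'] / df['currentPrice'] - 1

# 'O' only when net income rose in both year-over-year steps.
rising = (df['ear2021'] > df['ear2020']) & (df['ear2020'] > df['ear2019'])
df['is_target'] = np.where(rising, 'O', 'X')

ranked = df.sort_values(by='gap', ascending=False)
print(ranked[['code', 'gap', 'is_target']])
```

A negative `gap`, as for `BBB` here, means the target price sits below the current price.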
